ICL_results.txt
Meta Llama model:
0-shot:
Validation accuracy: 0.3438
Validation macro-F1: 0.3366
Test (ID) accuracy: 0.3212
Test (ID) macro-F1: 0.3164
Test (OOD) accuracy: 0.4545
Test (OOD) macro-F1: 0.4167
1-shot:
Validation accuracy: 0.2604
Validation macro-F1: 0.2143
Test (ID) accuracy: 0.2435
Test (ID) macro-F1: 0.2170
Test (OOD) accuracy: 0.1818
Test (OOD) macro-F1: 0.1250
2-shot:
Validation accuracy: 0.2448
Validation macro-F1: 0.2560
Test (ID) accuracy: 0.2902
Test (ID) macro-F1: 0.3103
Test (OOD) accuracy: 0.3636
Test (OOD) macro-F1: 0.3077
3-shot:
Validation accuracy: 0.3177
Validation macro-F1: 0.2741
Test (ID) accuracy: 0.2902
Test (ID) macro-F1: 0.2478
Test (OOD) accuracy: 0.3636
Test (OOD) macro-F1: 0.2692
4-shot:
Validation accuracy: 0.2083
Validation macro-F1: 0.2243
Test (ID) accuracy: 0.2073
Test (ID) macro-F1: 0.1983
Test (OOD) accuracy: 0.5455
Test (OOD) macro-F1: 0.4615
5-shot:
Validation accuracy: 0.2188
Validation macro-F1: 0.1876
Test (ID) accuracy: 0.2798
Test (ID) macro-F1: 0.2608
Test (OOD) accuracy: 0.5455
Test (OOD) macro-F1: 0.4444
6-shot:
Validation accuracy: 0.2240
Validation macro-F1: 0.1879
Test (ID) accuracy: 0.2176
Test (ID) macro-F1: 0.2120
Test (OOD) accuracy: 0.3636
Test (OOD) macro-F1: 0.2262
7-shot:
Validation accuracy: 0.2552
Validation macro-F1: 0.2605
Test (ID) accuracy: 0.2021
Test (ID) macro-F1: 0.1641
Test (OOD) accuracy: 0.5455
Test (OOD) macro-F1: 0.4722
8-shot:
Validation accuracy: 0.2344
Validation macro-F1: 0.1745
Test (ID) accuracy: 0.2124
Test (ID) macro-F1: 0.1968
Test (OOD) accuracy: 0.4545
Test (OOD) macro-F1: 0.4167
9-shot:
Validation accuracy: 0.1875
Validation macro-F1: 0.1480
Test (ID) accuracy: 0.1451
Test (ID) macro-F1: 0.0847
Test (OOD) accuracy: 0.2727
Test (OOD) macro-F1: 0.2308
10-shot:
Validation accuracy: 0.1979
Validation macro-F1: 0.1546
Test (ID) accuracy: 0.2176
Test (ID) macro-F1: 0.1920
Test (OOD) accuracy: 0.1818
Test (OOD) macro-F1: 0.1389
Mistral model:
0-shot:
Validation accuracy: 0.6198
Validation macro-F1: 0.6140
Test (ID) accuracy: 0.6528
Test (ID) macro-F1: 0.6446
Test (OOD) accuracy: 0.5455
Test (OOD) macro-F1: 0.4048
1-shot:
Validation accuracy: 0.7500
Validation macro-F1: 0.6958
Test (ID) accuracy: 0.7565
Test (ID) macro-F1: 0.7090
Test (OOD) accuracy: 0.8182
Test (OOD) macro-F1: 0.7222
2-shot:
Validation accuracy: 0.7188
Validation macro-F1: 0.7284
Test (ID) accuracy: 0.7668
Test (ID) macro-F1: 0.7204
Test (OOD) accuracy: 0.7273
Test (OOD) macro-F1: 0.6154
3-shot:
Validation accuracy: 0.7552
Validation macro-F1: 0.7433
Test (ID) accuracy: 0.7772
Test (ID) macro-F1: 0.7358
Test (OOD) accuracy: 0.8182
Test (OOD) macro-F1: 0.6923
4-shot:
Validation accuracy: 0.7396
Validation macro-F1: 0.7302
Test (ID) accuracy: 0.7720
Test (ID) macro-F1: 0.7087
Test (OOD) accuracy: 0.6364
Test (OOD) macro-F1: 0.4762
5-shot:
Validation accuracy: 0.7656
Validation macro-F1: 0.7594
Test (ID) accuracy: 0.8135
Test (ID) macro-F1: 0.7618
Test (OOD) accuracy: 0.8182
Test (OOD) macro-F1: 0.7222
6-shot:
Validation accuracy: 0.7917
Validation macro-F1: 0.7782
Test (ID) accuracy: 0.8342
Test (ID) macro-F1: 0.7847
Test (OOD) accuracy: 0.8182
Test (OOD) macro-F1: 0.7222
7-shot:
Validation accuracy: 0.7760
Validation macro-F1: 0.7660
Test (ID) accuracy: 0.8031
Test (ID) macro-F1: 0.7512
Test (OOD) accuracy: 0.8182
Test (OOD) macro-F1: 0.6923
8-shot:
Validation accuracy: 0.8073
Validation macro-F1: 0.7980
Test (ID) accuracy: 0.8394
Test (ID) macro-F1: 0.7893
Test (OOD) accuracy: 0.8182
Test (OOD) macro-F1: 0.7222
9-shot:
Validation accuracy: 0.7865
Validation macro-F1: 0.7762
Test (ID) accuracy: 0.8290
Test (ID) macro-F1: 0.7755
Test (OOD) accuracy: 0.8182
Test (OOD) macro-F1: 0.7222
10-shot:
Validation accuracy: 0.7969
Validation macro-F1: 0.7882
Test (ID) accuracy: 0.8135
Test (ID) macro-F1: 0.7436
Test (OOD) accuracy: 0.8182
Test (OOD) macro-F1: 0.7222
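
The log above follows a regular layout: a "<name> model:" header, then "<k>-shot:" headers, then "metric: value" lines. A minimal sketch of reading that layout into a nested dictionary and picking a shot count by a chosen metric is given below. The function names parse_results and best_shot are hypothetical helpers introduced here for illustration, not part of the original experiment code, and the SAMPLE text is a short excerpt copied from the Mistral results above.

```python
import re

# Patterns for the three kinds of lines seen in the log.
MODEL_RE = re.compile(r"^(?P<model>.+) model:$")
SHOT_RE = re.compile(r"^(?P<shots>\d+)-shot:$")
METRIC_RE = re.compile(r"^(?P<name>.+?):\s*(?P<value>\d*\.\d+)$")


def parse_results(text):
    """Parse the log into {model: {shots: {metric_name: value}}}."""
    results = {}
    model = None
    shots = None
    for raw in text.splitlines():
        line = raw.strip()
        if not line:
            continue
        m = MODEL_RE.match(line)
        if m:
            model = m["model"]
            results[model] = {}
            shots = None
            continue
        m = SHOT_RE.match(line)
        if m and model is not None:
            shots = int(m["shots"])
            results[model][shots] = {}
            continue
        m = METRIC_RE.match(line)
        if m and model is not None and shots is not None:
            results[model][shots][m["name"]] = float(m["value"])
    return results


def best_shot(results, model, metric):
    """Return the shot count with the highest value of `metric` (first wins on ties)."""
    per_shot = results[model]
    return max(per_shot, key=lambda k: per_shot[k][metric])


# Excerpt copied from the Mistral section of this file.
SAMPLE = """\
Mistral model:
0-shot:
Validation accuracy: 0.6198
Validation macro-F1: 0.6140
1-shot:
Validation accuracy: 0.7500
Validation macro-F1: 0.6958
"""

if __name__ == "__main__":
    parsed = parse_results(SAMPLE)
    print(best_shot(parsed, "Mistral", "Validation macro-F1"))
```

Choosing the shot count on a validation metric rather than on either test split is one reasonable convention, since it keeps the test numbers untouched by model selection; whether the original experiments selected this way is not stated in the log.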