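Can Cloze Option Word Frequencies in the Guangzhou Zhongkao English Exam Predict Those in Zhongkao Exams Nationwide? A Corpus Study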

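I. Background and Research Question
The Guangzhou corpus consists of the cloze passages from 116 Guangzhou Zhongkao (senior high school entrance exam) papers and first-round mock papers. Counting the option words in these passages yields:
1. A word list.
2. The frequency of each word (word frequency). For example, a frequency of 100 for take means that take appeared as an option 100 times. To make the corpora comparable, every frequency is normalized to a percentage of the number of papers, so take is 100/116 = 86.2%.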
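The normalization described above can be sketched in a few lines. This is a minimal illustration, not the author's script; the function name `normalized_frequency` is an assumption made here.

```python
def normalized_frequency(option_count: int, n_papers: int) -> float:
    """Return occurrences per paper as a percentage, rounded to one decimal."""
    if n_papers <= 0:
        raise ValueError("n_papers must be positive")
    return round(100 * option_count / n_papers, 1)


# The article's own example: take appears 100 times across 116 Guangzhou papers.
print(normalized_frequency(100, 116))  # 86.2
```

Because the numerator counts occurrences rather than papers, the result is an occurrences-per-paper rate and could in principle exceed 100% for a word that is often an option more than once in the same paper.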
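A second corpus of 200 Zhongkao papers from regions across China was processed in the same way, giving a list of cloze option words and their frequencies.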
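Research question: to what extent can the cloze option word frequencies in the Guangzhou corpus predict the cloze option word frequencies in Zhongkao exams nationwide?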
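II. Method
Words with a frequency of at least 4% in the national corpus (at least 8 occurrences across the 200 papers) were selected, 454 in total. After removing high-frequency function words (such as up, nobody, and whether), 400 words remained and entered the analysis.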
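The selection step can be sketched as a threshold filter followed by an exclusion list. This is only an illustration under stated assumptions: `select_words` is a hypothetical helper, the exclusion set holds just the three function words the article names (the full list is not given), and the sample frequencies for up, whether, and rare are made up for the demonstration.

```python
# Only the three examples named in the article; the full exclusion list is unknown.
FUNCTION_WORDS = frozenset({"up", "nobody", "whether"})


def select_words(national_freq, threshold=4.0, exclude=FUNCTION_WORDS):
    """Keep words at or above the threshold (in percent) that are not excluded."""
    return sorted(
        word
        for word, freq in national_freq.items()
        if freq >= threshold and word not in exclude
    )


# look and give (30%) come from the article; the other values are invented.
sample = {"look": 30.0, "give": 30.0, "up": 9.0, "whether": 5.0, "rare": 2.0}
print(select_words(sample))  # ['give', 'look']
```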
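III. Findings
1. High-frequency options
In the national cloze passages, the most frequent option words are look and give (both at 30%). Other high-frequency words are listed below.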
| Core word | Derived forms (count) | National frequency | Guangzhou frequency |
| --- | --- | --- | --- |
| look | look (27); looks (4); looked (22); looking (7) | 30% | 53% |
| give | gave (21); give (23); gift (7); given (4); gifts (2); giving (3) | 30% | 29% |
| care | carelessly (8); carefully (26); careful (11); cared (3); care (5); careless (2); caring (1) | 28% | 43% |
| worry | worrying (3); worried (33); worriedly (6); worry (12) | 27% | 28% |
| surprise | surprise (14); surprised (35); surprisingly (1); surprising (1); surprises (1) | 26% | 31% |
| excite | excited (32); excitedly (11); exciting (5); excitingly (1); excitement (1) | 25% | 46% |
| anger | angry (23); angrily (20); anger (5) | 24% | 38% |
| bore | boring (19); bored (28) | 24% | 26% |
| forget | forgot (21); forget (18); forgotten (2); forgets (1); unforgettable (2); forgettable (1) | 23% | 12% |
| happy | happily (10); happy (26); unhappy (1); happiness (5); happiest (2) | 22% | 44% |
| interest | interested (12); interesting (14); interest (14); interests (2) | 21% | 30% |
| help | helped (5); help (19); helping (6); helpful (9); helps (1); helpless (1) | 21% | 24% |
| change | changes (8); change (16); changed (14); changing (2); changeable (1) | 21% | 17% |
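The article does not say which corpus the per-form counts come from, but summing them and dividing by 200 reproduces the national column, so they appear to be national counts grouped under a core word. The sketch below checks this for three rows of the table; `family_frequency` is a name chosen here for illustration.

```python
NATIONAL_PAPERS = 200

# Per-form counts copied from the table above.
families = {
    "look": {"look": 27, "looks": 4, "looked": 22, "looking": 7},
    "give": {"gave": 21, "give": 23, "gift": 7, "given": 4, "gifts": 2, "giving": 3},
    "worry": {"worrying": 3, "worried": 33, "worriedly": 6, "worry": 12},
}


def family_frequency(forms, n_papers):
    """Sum the counts of all derived forms and express them per paper, in percent."""
    return 100 * sum(forms.values()) / n_papers


for core, forms in families.items():
    print(core, family_frequency(forms, NATIONAL_PAPERS))
# look 30.0
# give 30.0
# worry 27.0
```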
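2. Correlation
Across all 400 high-frequency option words, the frequencies in the two corpora are strongly correlated, with a correlation coefficient of 0.828. In other words, option word frequencies from the Guangzhou Zhongkao predict the national frequencies well.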
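The article does not state which coefficient was used. Assuming it is Pearson's r, the computation can be sketched as follows. The input here is only the thirteen rows of the high-frequency table in section 1, so the printed value will not reproduce the 0.828 obtained from all 400 words.

```python
import math


def pearson_r(xs, ys):
    """Pearson correlation coefficient of two equal-length numeric sequences."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    spread_x = math.sqrt(sum((x - mean_x) ** 2 for x in xs))
    spread_y = math.sqrt(sum((y - mean_y) ** 2 for y in ys))
    if spread_x == 0 or spread_y == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return cov / (spread_x * spread_y)


# National and Guangzhou frequencies (percent) for the 13 words in section 1.
national = [30, 30, 28, 27, 26, 25, 24, 24, 23, 22, 21, 21, 21]
guangzhou = [53, 29, 43, 28, 31, 46, 38, 26, 12, 44, 30, 24, 17]

r = pearson_r(national, guangzhou)
print(round(r, 3))  # subset only; not expected to equal 0.828
```

If the reported 0.828 is indeed a Pearson coefficient, its square is about 0.69, meaning a straight-line fit on the Guangzhou frequencies would account for roughly 69% of the variance in the national frequencies.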
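3. Differences
3.1 Words with similar frequencies
Taking the difference between the two frequencies for each word (Guangzhou minus national) shows that 350 words fall within ±5 percentage points, which is 87.5% of the 400 words. That is, the large majority of words have a Guangzhou frequency close to their national frequency.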
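The comparison in this section amounts to sorting each word into one of three bands by its difference. A minimal sketch, assuming that a difference of exactly 5 points counts as similar (the article does not specify the boundary) and using rounded percentages from the article's tables:

```python
def classify(national, guangzhou, band=5.0):
    """Label a word by its frequency difference (Guangzhou minus national)."""
    diff = guangzhou - national
    if diff > band:
        return "higher in Guangzhou"
    if diff < -band:
        return "higher nationally"
    return "similar"  # assumption: the boundary itself counts as similar


# (national %, Guangzhou %) pairs taken from the article's tables.
rows = {"look": (30, 53), "forget": (23, 12), "give": (30, 29), "worry": (27, 28)}

for word, (nat, gz) in rows.items():
    print(word, gz - nat, classify(nat, gz))
# look 23 higher in Guangzhou
# forget -11 higher nationally
# give -1 similar
# worry 1 similar
```

The rounded columns give -11 for forget, whereas the article reports -10, which suggests the article's differences were computed before rounding.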
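3.2 Words more frequent in the national Zhongkao
22 words are more than 5 percentage points more frequent nationally than in Guangzhou, with an average difference of 6.6 points. The largest gap is forget, at 23% nationally against only 12% in Guangzhou, a reported difference of 10 points (the difference column appears to be computed before rounding, so it can be one point off the rounded columns). The words with the largest differences are: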
| Core word | Derived forms (count) | National frequency | Guangzhou frequency | Difference |
| --- | --- | --- | --- | --- |
| forget | forgot (21); forget (18); forgotten (2); forgets (1); unforgettable (2); forgettable (1) | 23% | 12% | -10% |
| clear | clear (6); clearly (14); cleared (1) | 11% | 2% | -9% |
| promise | promises (3); promise (13); promised (8) | 12% | 4% | -8% |
| joke | joke (8); joked (4); jokes (4); joking (1) | 9% | 1% | -8% |
| start | start (10); started (9); starts (2) | 11% | 3% | -7% |
| still | still (14) | 7% | 0% | -7% |
| terrible | terribly (2); terrible (12) | 7% | 0% | -7% |
| refuse | refused (28); refuse (7); refuses (2); refusing (1) | 19% | 12% | -7% |
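3.3 Words more frequent in the Guangzhou Zhongkao
The same method as in 3.2 identifies the words tested more often in Guangzhou: 28 words are more than 5 percentage points more frequent there, with an average difference of 9.8 points. The largest gap is look, at 53% in Guangzhou against only 30% nationally, a difference of 23 points; happy is close behind at 22 points. The words with the largest differences are: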
| Core word | Derived forms (count) | National frequency | Guangzhou frequency | Difference |
| --- | --- | --- | --- | --- |
| look | look (27); looks (4); looked (22); looking (7) | 30% | 53% | 23% |
| happy | happily (10); happy (26); unhappy (1); happiness (5); happiest (2) | 22% | 44% | 22% |
| excite | excited (32); excitedly (11); exciting (5); excitingly (1); excitement (1) | 25% | 46% | 21% |
| care | carelessly (8); carefully (26); careful (11); cared (3); care (5); careless (2); caring (1) | 28% | 43% | 15% |
| anger | angry (23); angrily (20); anger (5) | 24% | 38% | 14% |
| work | workers (2); work (11); worked (8); working (3); works (1); worker (3) | 14% | 27% | 13% |
| difficult | difficult (17); difficulty (2) | 10% | 21% | 11% |
| sudden | suddenly (26) | 13% | 24% | 11% |
| luck | lucky (9); unluckily (6); luckily (13); luck (9); luckier (1); unlucky (1); luckiest (1) | 20% | 30% | 10% |
