Its composition was designed to match the original Brown Corpus in size and genres as closely as possible, using documents published in the UK in 1961 by British authors.[1] Both corpora consist of 500 samples, each comprising about 2,000 words, in the following genres:
Label  Text category                             Brown Corpus  LOB Corpus
A      Press: reportage                          44            44
B      Press: editorial                          27            27
C      Press: reviews                            17            17
D      Religion                                  17            17
E      Skills, trades and hobbies                36            38
F      Popular lore                              48            44
G      Belles lettres, biography, essays         75            77
H      Miscellaneous (documents, reports, etc.)  30            30
J      Learned and scientific writings           80            80
K      General fiction                           29            29
L      Mystery and detective fiction             24            24
M      Science fiction                           6             6
N      Adventure and western fiction             29            29
P      Romance and love story                    29            29
R      Humour                                    9             9
Total                                            500           500
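The arithmetic behind the table can be checked with a short Python sketch. The per-category counts are transcribed from the table above; the variable names (`SAMPLES`, `differences`) are illustrative and not part of any corpus distribution. The word estimate assumes the nominal figure of about 2,000 words per sample, so it is only approximate.

```python
# Samples per text category as (Brown, LOB), transcribed from the table above.
SAMPLES = {
    "A": (44, 44), "B": (27, 27), "C": (17, 17), "D": (17, 17),
    "E": (36, 38), "F": (48, 44), "G": (75, 77), "H": (30, 30),
    "J": (80, 80), "K": (29, 29), "L": (24, 24), "M": (6, 6),
    "N": (29, 29), "P": (29, 29), "R": (9, 9),
}

# Column totals: both should come to 500 samples.
brown_total = sum(brown for brown, _ in SAMPLES.values())
lob_total = sum(lob for _, lob in SAMPLES.values())

# Categories where the two corpora differ (LOB count minus Brown count).
differences = {
    label: lob - brown
    for label, (brown, lob) in SAMPLES.items()
    if lob != brown
}

# Rough corpus size, assuming about 2,000 words per sample.
approx_words = lob_total * 2000

print(brown_total, lob_total)  # 500 500
print(differences)             # {'E': 2, 'F': -4, 'G': 2}
print(approx_words)            # 1000000
```

Only categories E, F and G differ between the two corpora, and the differences cancel out, which is why both totals come to 500 samples (roughly one million words each).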
The corpus has also been tagged, i.e. a part-of-speech category has been assigned to every word.[2]