As of August 7, 2024, Baker Library's FAQs have moved to https://www.library.hbs.edu/services/help-center. Please visit Baker's Help Center to find our research FAQs and send us your questions. Thank you!
Baker Library has licensed the following newspapers from ProQuest for data mining. Currently, the newspapers are available on hard drives. For more information, please contact Alex Caracuzzo (acaracuzzo@hbs.edu).
Harvard affiliates may also want to explore ProQuest TDM Studio, a tool that allows you to mine large volumes of published content.
Newspaper Title | Years of XML/PDF Articles | Article-level vs. Page-level |
---|---|---|
Atlanta Constitution | 1868-1930 (XML only) | TBD |
Austin American Statesman | 1871-1926 | all years article-level |
The Baltimore Sun | 1837-1932 | all years article-level |
The Boston Globe | 1872-1987 | all years article-level |
Chicago Tribune | 1849-1935 | all years article-level |
The Christian Science Monitor | 1908-1995 | all years article-level |
The Cincinnati Enquirer | 1841-2009 | 1841-1922 article-level; 1923-2009 page-level |
Dayton Daily News | TBD | TBD |
Detroit Free Press | 1831-1999 | 1831-1922 article-level; 1923-1999 page-level |
Hartford Courant | 1764-1934 | all years article-level |
Los Angeles Times | 1881-1950 | all years article-level |
Louisville Courier-Journal | 1830-2000 | 1830-1922 article-level; 1923-2000 page-level |
Nashville Tennessean | 1812-2002 | 1812-1922 article-level; 1923-2002 page-level |
The New York Times | 1851-1933 (XML only) | TBD |
New York Tribune/Herald Tribune | 1841-1962 | all years article-level |
Newsday | 1940-1990 | all years article-level |
Philadelphia Inquirer | 1860-2001 | all years page-level |
San Francisco Chronicle | 1865-1922 | all years article-level |
St. Louis Post-Dispatch | 1874-2003 | 1874-1922 article-level; 1923-2003 page-level |
Wall Street Journal | 1889-1932 (XML only) | TBD |
Washington Post | 1877-1937 | TBD |
The Harvard Kennedy School also has a guide to resources available for text mining.
Text Analysis Tools
NVivo - https://library.harvard.edu/services-tools/nvivo
MALLET - http://mallet.cs.umass.edu/
Voyant Tools - http://voyant-tools.org/
Computational Literature Review (clR) - https://github.com/rvidgen/clr
Google Ngram Viewer - https://books.google.com/ngrams
Natural Language Toolkit (Python) - http://www.nltk.org/
Stanford CoreNLP - https://stanfordnlp.github.io/CoreNLP/index.html#download
Copyright © 2022 President & Fellows of Harvard College.