
Text Data Mining Library Resources

Guide to accompany the Fall 2022 Workshop


| Description | Snippet |
|---|---|
| Enable all rows to print from a DataFrame | pd.set_option('display.max_rows', None) |
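
The snippet above can be tried out with a short sketch like the one below. It assumes pandas is installed; the 100-row DataFrame and its `token` column are made up for illustration. With default settings pandas abbreviates the printed output of long DataFrames, and setting `display.max_rows` to `None` removes that limit.

```python
import pandas as pd

# Hypothetical data: 100 rows, more than pandas prints by default.
df = pd.DataFrame({"token": [f"word_{i}" for i in range(100)]})

# With default settings the printed form is abbreviated (middle rows elided).
truncated = repr(df)

# Lift the row limit only inside this block, leaving global settings alone.
with pd.option_context("display.max_rows", None):
    full = repr(df)

# The snippet from the guide changes the setting for the whole session.
pd.set_option("display.max_rows", None)
print(df)

# Restore the default afterwards so later output is abbreviated again.
pd.reset_option("display.max_rows")
```

`pd.option_context` is a reasonable choice when only one DataFrame needs to be shown in full, since printing every row of a large corpus table can flood a notebook.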
   

 

Resources with TDM Content and/or Support

| Database | Description | Downloadable for Use in 3rd-Party Tools? | Text Processing & Cleaning | Analysis & Visualization |
|---|---|---|---|---|
| Gale Digital Scholar's Lab | Provides an online environment for creating custom content data sets from the corpus of the library’s Gale Primary Sources holdings. Users can analyze and interrogate the data with the text analysis and visualization tools built into the platform. Digital humanities analysis methods include Named Entity Recognition, Topic Modelling, Parts of Speech, and more. Output can be published with confidence because intellectual property rights are retained and analysis outputs are free to share. | YES | YES | YES |
| NewsBank/Readex | Readex, a division of NewsBank, publishes collections of primary source research materials. | NO | NO (only provides simple stop word capabilities) | YES |
| Nexis Data Lab | A data mining and analysis tool that enables researchers to search across the Nexis news archive to create data sets efficiently and effectively in a controlled environment where they can be analyzed and visualized. Its design is appropriate for novice or advanced data mining researchers. | YES | YES | YES |
| HathiTrust Research Center (HTRC) | HathiTrust Research Center (HTRC) enables computational analysis of works in the HathiTrust Digital Library (HTDL) to facilitate non-profit research and educational uses of the collection. | NO (but provides a virtual Linux desktop) | YES | YES |
| Adam Matthew Content | The company publishes collections of digitized primary source materials from different historical eras. For example, Empire Online covers the histories of the colonial-era United States, Canada, India, Australia, South Africa, and Britain. Other collection topics include gender studies, American history and consumer culture, Victorian England, Asian history, and the First World War. | YES (but requires contacting Baylor Libraries) | NO | NO |
| Project Gutenberg | Provides access to freely downloadable books whose United States copyright has expired. Search by title or author, or browse by topic. Download to PCs, e-readers, smartphones, and other portable devices. Includes some audiobooks for books whose copyright has expired, and links to download selected titles to CDs or DVDs. Mobile app available. | YES | NO | NO |
| Internet Archive (Archive.org) | Internet Archive is a non-profit library of millions of free books, movies, software, music, websites, and more. | SOME (but not all) | NO | NO |
| NCapture | NCapture is a free Chrome web-browser extension that allows you to quickly and easily capture content such as web pages, online PDFs, tweets, and more. | YES | NO | YES (but only using NVivo) |
 

Examples of Custom TDM Tools Created at Baylor University


University Libraries

One Bear Place #97148
Waco, TX 76798-7148

(254) 710-6702