ZacharyST

@ZacharyST

Sep 3, 2018 • 6 tweets • 4 min read

1/n. Saving all the cool new data sources and tools I learned about at #APSA2018.

Cliff-Clavin for geolocating newspaper articles: cliff.mediacloud.org. Though Mordecai will eventually replace it for automated event data: github.com/openeventdata/….

#terrier, the new and improved Phoenix (automated events data), terrierdata.org. See more at: osf.io/4m2u7/.

@CommonCrawl (commoncrawl.org), 250 TB of crawled web pages per month for 7 years, available for free. Tons of documentation and examples. Even better, there is a separate archive of news text (1,000 sources from @DMOZ, en.wikipedia.org/wiki/DMOZ).

#babelnet, multilingual dictionary and semantic network. Rate limited, but possibly willing to work with academics. babelnet.org/about

everypolitician (everypolitician.org), from @mySociety, is data on national legislators from what appears to be literally every country. Includes email address, social media handles, party, district ID, gender, duration in power.
