👩‍💻 DynamicWebPaige @ 127.0.0.1 🏠's Threads

Oct 8, 2018 • 7 tweets • 5 min read

TIL:

💃Good folks at @UniofOxford tagged+categorized pose categories in several episodes of "Buffy the Vampire Slayer"

📃"2D pose estimation in TV shows" is a body of academic work

🙌I've a heretofore unrealized desire to determine which BuffyPose has the highest % of frames

✨🧠

robots.ox.ac.uk/~vgg/data/buff…

(1) Train a model on the annotated example BuffyPoses;

(2) have it cycle through every episode of every season of "Buffy";

(3) determine BuffyPose with the greatest percentage of frames;

(4) don't forget to take ample time in S01 and S02 to:

Sep 27, 2018 • 14 tweets • 6 min read

📓 Am rereading my class notes from grad school, as well as from mentoring students for @Coursera and @EdX courses on statistics - and thought I'd share the most common mistakes when doing data analysis.

✨Have counted 8 of 'em, with examples - please feel free to add your own!

MISTAKE #1: Garbage in, garbage out.

🤦‍♀️Failing to investigate your input for data entry or recording errors.

📊Failing to graph data and calculate basic descriptive statistics (mean, median, mode, outliers, etc.) before analyzing it in-depth.
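
A quick sketch of that pre-analysis check, using only Python's standard library. The `describe` helper, the sample `readings`, and the 1.5 × IQR outlier rule are illustrative assumptions, not something from the original thread; a real workflow would likely also plot the data.

```python
import statistics

def describe(values):
    """Return basic descriptive statistics plus IQR-based outliers."""
    ordered = sorted(values)
    # Quartiles via the standard library (default 'exclusive' method).
    q1, _, q3 = statistics.quantiles(ordered, n=4)
    iqr = q3 - q1
    # Assumed rule of thumb: flag points beyond 1.5 * IQR from the quartiles.
    low, high = q1 - 1.5 * iqr, q3 + 1.5 * iqr
    return {
        "mean": statistics.mean(ordered),
        "median": statistics.median(ordered),
        "mode": statistics.mode(ordered),
        "outliers": [v for v in ordered if v < low or v > high],
    }

# Hypothetical readings; 250 looks like a data-entry error.
readings = [21, 22, 22, 23, 24, 25, 250]
summary = describe(readings)
print(summary)
```

Here the single bad entry drags the mean far above the median, which is the kind of red flag worth chasing before any in-depth modeling.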

Sep 18, 2018 • 11 tweets • 6 min read

🗣Some recommendations for budding machine learning engineers:

(1) Make sure your sample dataset is representative of your entire population - and remember that more data is usually - but not necessarily! - better.

Also consider using image preprocessing tools, like Augmentor.

(2) Use small, random batches to train rather than the entire dataset.

⏳Reducing your batch size increases training time, but it also decreases the likelihood that your optimizer will settle into a local minimum instead of finding the global minimum (or something closer to it).
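
To make "small, random batches" concrete, here's a minimal sketch of mini-batch gradient descent in plain Python. Everything in it is assumed for illustration (the `minibatches` and `fit_slope` helpers, the learning rate, the toy data). It fits a one-parameter line, which has a single minimum, so it only shows the batching mechanics and not the local-minimum behavior.

```python
import random

def minibatches(data, batch_size, rng):
    """Yield shuffled slices of data; the last batch may be smaller."""
    shuffled = data[:]
    rng.shuffle(shuffled)
    for start in range(0, len(shuffled), batch_size):
        yield shuffled[start:start + batch_size]

def fit_slope(data, batch_size=4, epochs=200, lr=0.01, seed=0):
    """Fit y = w * x with mini-batch gradient descent on squared error."""
    rng = random.Random(seed)  # seeded so the run is reproducible
    w = 0.0
    for _ in range(epochs):
        # Reshuffle every epoch so each batch is a fresh random sample.
        for batch in minibatches(data, batch_size, rng):
            # Gradient of mean squared error with respect to w, on this batch only.
            grad = sum(2 * (w * x - y) * x for x, y in batch) / len(batch)
            w -= lr * grad
    return w

# Hypothetical noise-free data drawn from y = 3x.
points = [(x, 3.0 * x) for x in range(1, 11)]
w = fit_slope(points)
print(round(w, 4))
```

Each update only looks at `batch_size` examples, so the gradient is a noisy estimate of the full-dataset gradient; that noise is what the tweet credits with helping an optimizer move past poor minima.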

Apr 12, 2018 • 9 tweets • 7 min read

1) @ProjectJupyter Extension of the Day: Spellchecker!

This #nbextension uses a @CodeMirror overlay mode to highlight incorrectly-spelled words in Markdown and Raw cells. The typo.js library does the actual spellchecking, and is included as a dependency.

…r-contrib-nbextensions.readthedocs.io/en/latest/nbex…

.@ProjectJupyter Extension of the Day #2: Codefolding!

This extension adds codefolding functionality from @CodeMirror to each code cell in your notebook. The folding status is saved in the cell metadata, so reloading a notebook restores the folded view.

…r-contrib-nbextensions.readthedocs.io/en/latest/nbex…

Feb 12, 2018 • 28 tweets • 18 min read

Inspired by the big ol' long list of deep learning models I saw this morning, and @SpaceWhaleRider's love of science-y A-Z lists, I've decided to create an A to Z series of tweets on popular #MachineLearning and #DeepLearning methods / algorithms.

Ready? Here we go:

A is for... the Apriori Algorithm!

Intended to mine frequent itemsets for Boolean association rules (like market basket analysis). Ex: if someone purchases the same products as you, in general, then you'd probably purchase something they've purchased.

cran.r-project.org/web/packages/a…
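
For the curious, a rough Python sketch of the frequent-itemset part of Apriori. This is a simplified illustration, not the R package linked above; the `apriori` function, the `baskets` data, and the 0.6 support threshold are all made up for the example.

```python
from itertools import combinations

def apriori(transactions, min_support):
    """Return {itemset: support} for every itemset meeting min_support."""
    transactions = [frozenset(t) for t in transactions]
    n = len(transactions)

    def support(itemset):
        # Fraction of transactions that contain every item in the itemset.
        return sum(itemset <= t for t in transactions) / n

    items = {item for t in transactions for item in t}
    current = {frozenset([i]) for i in items}
    current = {s for s in current if support(s) >= min_support}
    frequent = {}
    k = 1
    while current:
        frequent.update({s: support(s) for s in current})
        k += 1
        # Join step: union pairs of frequent (k-1)-itemsets into k-itemsets.
        candidates = {a | b for a in current for b in current if len(a | b) == k}
        # Prune step: every (k-1)-subset of a candidate must itself be frequent.
        candidates = {
            c for c in candidates
            if all(frozenset(sub) in current for sub in combinations(c, k - 1))
        }
        current = {c for c in candidates if support(c) >= min_support}
    return frequent

# Hypothetical market baskets.
baskets = [
    {"bread", "milk"},
    {"bread", "butter", "milk"},
    {"bread", "butter"},
    {"milk", "butter"},
    {"bread", "milk", "butter"},
]
frequent = apriori(baskets, min_support=0.6)
for itemset, sup in sorted(frequent.items(), key=lambda kv: (len(kv[0]), sorted(kv[0]))):
    print(sorted(itemset), sup)
```

The prune step is the core idea: an itemset can only be frequent if all of its subsets are, so most candidates never need to be counted. Association rules are then derived from these frequent itemsets.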

Dec 1, 2017 • 8 tweets • 2 min read

So, time to drop some knowledge bombs. Most data scientists aren't taught:

- TCP/IP Protocol architectures
- how to deploy a server
- RESTful vs SOAP web services
- Linux command line tools
- the software development life cycle
- modular functions + the concept of writing tests
- distributed computing
- why GPU cores are important
- client-side vs server-side scripting

...and that's just a subset. If you meet a data scientist who is familiar with those concepts, it's because they either have a CS or IT background, or they taught themselves.
