Coding

Completed tasks and plans for learning Software Engineering/Data Science

September

Week 3: Bash practice

  • [ ] Bash data analytics - Data36
  • [ ] Learn
    • [ ] GCP
    • [ ] Azure
    • [ ] AWS
    • [ ] Spark
  • [ ] Python gaps
    • [ ] Data Camp - Intro to Python fundamentals
    • [ ] classes
  • [ ] Thredup planning
    • [ ] What data am I extracting?
    • [ ] How do I want to build a recommendation model?

Week 2: SQL + job applications

  • [x] apply to jobs + internships
  • [x] SQL refresher - Data36
  • [x] Leetcode practice - Python
  • [x] ODSC scholarship application
  • [x] ODSC registration for all events

Week 1: Thredup database

Plan

next month

  • [ ] classification models (most popular):

    • [ ] support vector machine (SVM)
    • [ ] logistic regression
    • [ ] decision trees
    • [ ] random forest
    • [ ] XGBoost
    • [ ] convolutional neural network
    • [ ] recurrent neural network
  • [ ] Python functions, classes, data structures

  • [ ] Python Data Science Handbook - understand + memorize

  • [ ] copy links to "Migrating to Linux" page

  • [ ] scan notes + fill in mistakes with audio?

  • [ ] DS assignment - now with linear regression

  • [ ] Titanic project

  • [ ] Lambda School - follow curriculum

  • [ ] ODSC mini-bootcamp - follow curriculum

  • [ ] Hack Reactor student projects

  • [ ] program to manipulate MIDI files with python to produce a new sound

  • [ ] take a picture of a bomber jacket in a store, Thredup will find an alternative on its website. Maybe Poshmark as well?

  • [ ] Migrate over to networked-thought tool prior to Mid-year review?

  • [ ] json - for debugging

  • [ ] Possible to contribute to Athens?

Short term - end of 2020

  • [ ] Daniel Bourke’s youtube video - Titanic Kaggle project
  • [ ] Complete Codebasics playlist
    • [ ] Different models + data cleaning/manipulating
  • [x] How to upload and pull from Github using command line?
  • [x] Resume
  • [x] Cover letters
  • [ ] Fix all code in Jupyter notebooks + colab notebooks + add more comments
    • [ ] run additional ML models, improve score
  • [ ] Start a side project
  • [ ] Airbnb app from kickstarter?
  • [ ] python library for thredup and poshmark?
  • [ ] Markup guidelines for github + jupyter for aesthetics
  • [ ] Digitize ML mind map
    • [ ] visual
    • [ ] detailed explanations for overview
  • [ ] CS or data structures class
  • [ ] Statistics - Statquest
  • [ ] Linear Algebra - 3blue1brown
  • [ ] Sklearn + Tensorflow - Hands-On Machine Learning with Scikit-Learn and TensorFlow: Concepts, Tools, and Techniques to Build Intelligent Systems
  • [ ] Command Line - Data36
  • [ ] look at the profiles from "So Good They Can't Ignore You"
  • [ ] Python basics - go through the whole list
  • [ ] David Cournapeau - created Scikit-learn as a Google Summer of Code project
  • [ ] Archaeopteryx - dance AI software but for a different genre?
    • [ ] using cakewalk's software to alter music?
    • [ ] software that uses midi files as input and outputs an altered file?
  • [ ] Mix music genres together to generate a song?
  • [ ] Turns a picture into a paint by numbers, colored pencils, or just for sketching
  • [ ] Algorithms:
    • [ ] recommend books you normally don't read. Will then provide a sample. History + politics = science fiction?
  • [ ] Carbon footprint comparison: shipping a garment vs. buying a new one from a store. How many times can it be shipped before it is no longer the better option?
  • [ ] grant finder for research universities? How much time can it save them? Fill out a majority of the forms as well?
  • [ ] ML for networked thought
    • [ ] Roam - reverse engineer
    • [ ] Obsidian - Roam alternative
    • [ ] ClickUp - use API to test
    • [ ] Notion - no API but use Lotion's API?
  • [ ] ML to delete duplicates on Google Photos

Source: Jason Benn's article, "Everything you need to become a self-taught Machine Learning Engineer"

Basic programming - make sure foundations are covered

  • Make something similar for the Odin Project but for Data Science

Learn:

  • Data Visualization: Tableau, Apache Spark
  • Hadoop
  • Databases: SQL
  • Cloud environments: AWS, Azure, Kubernetes
  • web service technologies: JSON
  • Thredup scraper - remove favorites

Long term - 2021

Teach Yourself Computer Science - 2-3 years

Bradfield CS courses ~ 5 weeks long per course - paid version of TYCS

Dive into Machine Learning

ML curriculum (designed by Jason Benn)

fast.ai - part 1 and 2 (60 hours total)

Internship - Ideally 3 months long

Job - Mid to late 2021

What field do I want to work in?

  • sustainability
  • efficiency
  • community engagement
  • local government
  • education
  • health
  • food
  • games?
  • art
  • music

What skills do I want?

  • use various types of models and software
  • lots of Python + an additional language

What am I looking for in a company?

  • small size (less than 100?) - has a startup feel
  • casual setting - not meeting clients all the time, not formal wear
  • focus on work/life balance
  • located in a cool location - city or near lots of nature (mountains or beach)
  • pro-bono work - on the side or as main work
  • accelerated learning especially from peers

Done

August

Week 2-4: Thredup Database

  • created a kanban board - all todos have migrated to the github project page
  • Updated README file on Github for latest todo

Week 1: Thredup Database

  • Thredup project
    • clean-up Obsidian notes
    • create a copy of a Github folder that syncs md file with Obsidian's file (don't want to keep copies at two locations)
    • push to Github (code + md notes)
  • Anki
    • Git cards - push, pull, etc.
  • Website
    • fix site
    • ideas: menu, posts, layout
  • Org-Roam
    • Test out 2nd Org-Roam installation on separate conda environment
    • Guide to Installation: complete
  • Linux
    • Anki
    • conda env for thredup project
    • update migration to Linux page

July

Week 5: Thredup Database

  • Thredup project
    • finish all functions
    • rest of the functions to be used later - also worked on them
  • Dual Boot Laptop - oh boy!
  • Figure out environments: conda and homebrew
    • Experiment with installations with homebrew, then conda on other Linux distro - didn't use conda

Week 4: Thredup database

  • Thredup Webscrape:
    • links
      • combine links with href header: thredup.com
    • split urls for different clothing categories
    • realized: cannot scrape for categories within "all petite" items
    • multiple functions
  • DS setups + those who do both DS + side projects
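
The link-handling steps in the webscrape notes above (combine relative hrefs with the thredup.com header, then split URLs by clothing category) can be sketched roughly as follows. This is a minimal illustration, not the project's actual code: the sample hrefs are made up, and treating the first path segment as the category is an assumption about the URL layout.

```python
from collections import defaultdict
from urllib.parse import urljoin, urlparse

BASE_URL = "https://www.thredup.com"  # site root named in the notes


def absolutize(hrefs, base=BASE_URL):
    """Join relative hrefs onto the base URL; absolute URLs pass through unchanged."""
    return [urljoin(base, href) for href in hrefs]


def group_by_category(urls):
    """Group URLs by first path segment (ASSUMED to be the clothing category)."""
    groups = defaultdict(list)
    for url in urls:
        segments = [s for s in urlparse(url).path.split("/") if s]
        key = segments[0] if segments else ""
        groups[key].append(url)
    return dict(groups)


# Hypothetical hrefs for illustration only; real paths on the site may differ.
sample_hrefs = ["/dresses/item-1", "/tops/item-2", "/dresses/item-3"]
full_urls = absolutize(sample_hrefs)
by_category = group_by_category(full_urls)
print(by_category)
```

Using `urljoin` instead of plain string concatenation avoids doubled or missing slashes and leaves already-absolute links alone.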

Week 3: Debug + Environments

  • Work on Thredup Page
    • ideas: entire python library + build database using all petite items
    • web scraper for pulling non-polyester items stopped working
    • web scraper for pulling 100% linen items - use database instead
    • web scraper for building a database - start
      • pull links for the main pages - won't work with bs4, due to website layout. Not all hrefs can be pulled + random hrefs for products get pulled. Need to use XML formatting
        • yank all hrefs/page - not anymore
        • selenium to move to the next pages - not anymore
  • Get Doom Emacs
  • Purchase computer accessories for working at home
  • Acer One laptop:
    • attempt clean-up
    • wipe out system

Week 2: Anki + Feynman technique

  • Fix thredup code
  • move over any "Spotify Project" related files from host to guest OS
  • move over simplified version of Digital Ocean files
  • debugging
  • Digital Ocean:
    • sort through all files
    • organize
    • download
    • delete - will cancel credit card
  • LinkedIn - update all work descriptions
  • Feynman | DS concepts for Classification:
    • AUC (area under the ROC curve)
  • Feynman | DS concepts for Regression
  • create a map for all concepts so far + tangents to see what directions I've been going in
  • start Thred-up ML project - 1 hour
  • start community involvement project
  • Anki: map, filter, reduce and lambda:
    • definition
    • syntax
    • example
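
A possible "example" side for the map / filter / reduce / lambda cards above, as a small sketch. The list of numbers and the variable names are made up for illustration.

```python
from functools import reduce  # reduce lives in functools in Python 3

numbers = [1, 2, 3, 4, 5]

# lambda: an anonymous function limited to a single expression
square = lambda x: x * x

# map: apply a function to every element (returns a lazy iterator, hence list())
squares = list(map(square, numbers))

# filter: keep only the elements for which the predicate is true
evens = list(filter(lambda x: x % 2 == 0, numbers))

# reduce: fold the sequence into one value, starting from the initial value 0
total = reduce(lambda acc, x: acc + x, numbers, 0)

print(squares, evens, total)
```

In everyday Python, list comprehensions and `sum()` usually replace `map`, `filter`, and this particular `reduce`, but the functional forms are the ones the cards are about.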

Week 1 (3 days): Anki + Feynman

  • Anki | python code in DS | 20 cards
  • Feynman | DS concepts for Classification:
    • Confusion Matrix
    • Accuracy
    • Precision & Recall
    • F1 Score
    • Harmonic Mean (from F1 score)
    • F$\beta$-score (from F1 score)
    • Sensitivity & Specificity
    • ROC Curve
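
Most of the classification concepts listed above can be computed from the four confusion-matrix counts. The sketch below uses the standard definitions; the counts in the example are invented, and the function assumes no denominator is zero.

```python
def classification_metrics(tp, fp, fn, tn, beta=1.0):
    """Compute common metrics from confusion-matrix counts.

    tp/fp/fn/tn = true positives, false positives, false negatives, true negatives.
    Assumes every denominator below is non-zero.
    """
    accuracy = (tp + tn) / (tp + fp + fn + tn)
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)        # also called sensitivity
    specificity = tn / (tn + fp)
    b2 = beta ** 2
    # F-beta is a weighted harmonic mean of precision and recall;
    # beta = 1 gives the F1 score.
    f_beta = (1 + b2) * precision * recall / (b2 * precision + recall)
    return {
        "accuracy": accuracy,
        "precision": precision,
        "recall": recall,
        "specificity": specificity,
        "f_beta": f_beta,
    }


# Made-up counts for illustration only.
metrics = classification_metrics(tp=40, fp=10, fn=20, tn=30)
print(metrics)
```

A beta above 1 weights recall more heavily and a beta below 1 weights precision more heavily. The ROC curve plots sensitivity against (1 - specificity) as the decision threshold varies, so it needs predicted scores rather than a single confusion matrix.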

June

Week 5 - 2 days:

  • Ultralearning review + update (mid year evaluation)
  • Notes | scikit learn

Week 4:

  • download RESULTER extension for google search shortcuts
  • increase storage again - new way for Ubuntu 20.04
  • reinstall Ubuntu after expanding hard drive mistake
  • VS code: basic debugging
  • VS code won't run Python - how to enable?
  • access shared folder between host and guest systems
  • Ideas vomit for open source projects: education (The Prize), government, businesses (Amazon → buying locally) Open Source Projects
  • Will Athens be using Emacs? - nope
  • change default environment to DJ-set - don't think you can

Week 3: Set up local machine - Ubuntu setup + Remixatron

  • delete XUbuntu
  • Install Ubuntu
  • Install VirtualBox Guest Additions - automatically adjust resolution (to match host) + shared clipboard and drag and drop between host and guest systems
  • Slow Ubuntu - wrong ISO mounted, reinstalled just in case
  • Install miniconda + conda onto VirtualBox
  • create environment
  • Organize environments + GitHub folders - problem arose from VS Code not being able to find pydub (conda had installed it in a different folder than where pip was installing)
  • no longer use Spotify's API
  • run Remixatron on command line - won't run because of pygame
  • embed client ID into jupyter notebook
  • audio analysis (audio from youtube) using Spotify's API
  • identify beat breakdown for Daechwita + DNA. Didn't work since I was downloading music from YouTube and using Spotify's analysis on it
  • switch from jupyter lab to VS Code - VS Code can run notebooks!
  • widget for jupyter notebook to play 2 audio samples (downloaded from youtube?)
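
For the pydub issue noted above (a package installed by one tool into a folder that the editor's interpreter does not look at), a quick check is to ask the running interpreter where it lives and where it would import the package from. This is a generic diagnostic sketch using only the standard library; `locate_package` is a helper name invented here.

```python
import importlib.util
import sys


def locate_package(name):
    """Return the file the current interpreter would import `name` from, or None."""
    spec = importlib.util.find_spec(name)
    if spec is None:
        return None  # not importable from this interpreter's search path
    return spec.origin


# Which Python is running, and which environment does it belong to?
print("interpreter:", sys.executable)
print("environment prefix:", sys.prefix)

# None here means pydub is not installed in this interpreter's environment.
print("pydub:", locate_package("pydub"))
```

Running this once in the terminal and once inside VS Code shows whether both are using the same environment. Installing with `python -m pip install pydub` ties the install to whichever interpreter `python` resolves to.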

Week 2: Research into 'Daechwita → DNA' project + VMs (using Linux fully)

  • created separate page for the 'Daechwita → DNA' project
  • beats and tempos match! extracted from dict in analysis
  • most code uses javascript with html + css (makes sense) with the spotify API
  • discovered lots of resources, for python as well
  • The Autocanonizer source: youtube
  • Spotify audio analysis source: spotify track plays locally on desktop while song bar progresses on the web browser
  • EternalJukebox - better README notes + is more updated than autocanonizer
  • Youtube-dl - command-line program to download videos from YouTube.com and a few more sites
  • Discovered Remixatron on AlternativeTo
    • forced myself to learn git to download and play around on the command line
  • VirtualBox vs. SSH?
  • Remixatron is giving array allocation errors. Code it myself!

Week 1: Led Zeppelin Project

  • Led Zeppelin project - documentation
  • Led Zeppelin project - define objective
  • pull + sort albums from given artist

May

Week 4: Blog posts + interview questions

  • Interview questions - 1 hour per day
  • Data Science assignment
  • How to solidify Python?

Week 3: Job applications

  • redirect URL to notion or redirect from WordPress site itself
  • Create word/pdf resume
  • search for jobs
  • cover letters
  • job applications

Week 2: Online presence

Week 1: Finish Python section of course

  • Python videos - basically finishes up the whole course

April - back home in Quarantine

March - Sri Lanka

February - Bangalore

2019

  • Python - Automate the boring stuff (first half of Udemy course)
  • The Odin Project - started it off
  • Data36 - SQL practice
  • Codewars - python + SQL practice
  • SQL - interview questions

NOTES

Original Plan in Google Drive:

Data Science - Ultralearning Project