Trying to learn Hadoop? You’re not alone. 🙂 Big Data and Hadoop are some of the hottest topics in the tech industry right now. This is great news for us, because it means there are plenty of training options and community-driven content available to get up to speed on these technologies. To help, I’ve compiled this list of free Hadoop resources, which I will keep updated as I discover new content.
Big Data & Hadoop Fundamentals
- Big Data University – a collection of mostly-free courses on Big Data-related technologies
  - Big Data Fundamentals – 1.5-hour self-paced course; provides a certificate of completion
- Hadoop Summit 2015 – the leading conference for the Apache Hadoop community
  - Slideshare – slides from all conference sessions
  - YouTube Playlist – video recordings of all conference sessions
- Hortonworks – company behind one of the major Hadoop distributions
  - Hortonworks Sandbox – excellent hands-on, tutorial-based learning
- Lynn Langit – a consultant, trainer, and prominent member of the data community
  - Hadoop MapReduce Fundamentals (first video of a 5-part series)
- MapR – company behind one of the major Hadoop distributions
- The Hadoop Ecosystem Table – provides a great overview of all the technologies that make up the Hadoop ecosystem
Apache Hive
- Apache Hive – the official site for Apache Hive
- Big Data University – a collection of mostly-free courses on Big Data-related technologies
- Cloudera – company behind one of the major Hadoop distributions
Apache Pig
- Apache Pig – the official site for Apache Pig
- Big Data University – a collection of mostly-free courses on Big Data-related technologies
- Cloudera – company behind one of the major Hadoop distributions
- Hyperpolyglot – provides side-by-side reference sheets for various languages
- Lynn Langit – a consultant, trainer, and prominent member of the data community
- Mortar Data – a cloud Hadoop company that has since been acquired by Datadog
  - Learn Pig – hands-on tutorial
  - Pig Cheat Sheet – excellent PDF guide to Pig syntax, with examples
  - SQL=>Pig Cheat Sheet – shorter version of the Pig Cheat Sheet
- O’Reilly
  - Programming Pig – the original, definitive guide to programming in Pig; now available for free
- Pig-Eye for the SQL Guy – blog that compares and contrasts Pig with SQL
Apache Spark
- Apache Spark – the official site for Apache Spark
- Big Data University – a collection of mostly-free courses on Big Data-related technologies
- TypeSafe
Apache Sqoop
- Big Data University – a collection of mostly-free courses on Big Data-related technologies
- Strata+Hadoop World – a prominent Hadoop-focused conference
Linux & Miscellaneous
- Bash Cheat Sheet
- RegEx Cheat Sheet
- VIM Adventures – freemium, game-based learning for text editing in VIM
- Khan Academy: Intro to SQL – great interactive tutorial; may help with learning Hive too
Did I miss something? Leave a comment and I’ll be happy to add it! 🙂