Python Data Mining Resources
Python has been gaining interest in the data mining community because it is
an open source, general-purpose programming and web scripting language.
Below are some resources to kick-start data mining with Python:
- The Python Tutorial (updated for version 2.7) – read this manual first!
- Orange - an open source tool for data visualization, data analysis and data mining, usable through visual programming or Python scripting.
- Scrapy - a fast, high-level screen scraping and web crawling framework written entirely in Python, used to crawl websites and extract structured data from web pages. Use it to collect raw data from websites for data mining purposes.
- Data Mining in Python – a collection of libraries useful for machine learning and data mining, especially clustering and supervised learning.
- Pattern - a web mining module for the Python programming language. It contains tools for data retrieval, text analysis and data visualization and comes with over 30 sample scripts.
- Data Extraction Tools Written in Python
- Information Retrieval with Python
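To give a rough idea of what "extracting structured data from web pages" means in practice, here is a minimal sketch that uses only the Python 3 standard library (`html.parser`). It is not how Scrapy or Pattern work internally; the `LinkExtractor` class, the `extract_links` helper and the sample HTML are hypothetical names made up for illustration.

```python
from html.parser import HTMLParser


class LinkExtractor(HTMLParser):
    """Hypothetical helper: collect (href, text) pairs for every <a> tag."""

    def __init__(self):
        super().__init__()
        self.links = []      # finished (href, text) records
        self._href = None    # href of the <a> tag currently open, if any
        self._text = []      # text fragments seen inside that tag

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            # attrs is a list of (name, value) tuples
            self._href = dict(attrs).get("href")
            self._text = []

    def handle_data(self, data):
        # Only keep text that appears inside an <a href=...> element
        if self._href is not None:
            self._text.append(data)

    def handle_endtag(self, tag):
        if tag == "a" and self._href is not None:
            self.links.append((self._href, "".join(self._text).strip()))
            self._href = None


def extract_links(html):
    """Parse an HTML string and return a list of (href, text) tuples."""
    parser = LinkExtractor()
    parser.feed(html)
    parser.close()
    return parser.links


# Made-up sample page; a real crawler would download this over HTTP.
SAMPLE_HTML = """
<ul>
  <li><a href="http://www.scipy.org/">SciPy</a></li>
  <li><a href="http://nltk.org">NLTK</a></li>
</ul>
"""

for href, text in extract_links(SAMPLE_HTML):
    print(text, href)
```

A dedicated framework adds the parts this sketch leaves out, such as downloading pages, following links and handling malformed markup.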
Scientific Computing:
- NumPy - numerical library, numpy.scipy.org/
- SciPy - advanced math, signal processing, optimization, statistics, www.scipy.org/
- Matplotlib - Python plotting, matplotlib.org
- NetworkX - graph analysis, networkx.lanl.gov/
- Orange - Data Mining Fruitful & Fun, biolab.si
- pandas - Python Data Analysis Library, pandas.pydata.org
- PyBrain - pybrain.org
- NLTK - Natural Language Toolkit, nltk.org
You may also like to know: Machine Learning With Python