Scrapely is a library for extracting structured data from HTML pages. Given some example web pages and the data to be extracted, scrapely constructs a parser for all similar pages.
Wikipedia is a Python library that makes it easy to access and parse data from Wikipedia. Search Wikipedia, get article summaries, get data like links and images from a page, and more. Wikipedia wraps the MediaWiki API so you can focus on using Wikipedia data, not getting it.
An HTTP/SOCKS5 tunnel for Twisted.
Lassie is a Python library for retrieving basic content from websites.
RegExpBuilder integrates regular expressions into the programming language, thereby making them easy to read and maintain. Regular expressions are created by using chained methods and variables such as arrays or strings.
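The chained-method idea can be illustrated with a minimal sketch built on the standard `re` module. The `PatternBuilder` class and its method names below are hypothetical and chosen for illustration only; they are not RegExpBuilder's actual API.

```python
import re


class PatternBuilder:
    """Hypothetical chained regex builder (not RegExpBuilder's real API)."""

    def __init__(self):
        self._parts = []

    def start_of_input(self):
        self._parts.append("^")
        return self

    def end_of_input(self):
        self._parts.append("$")
        return self

    def exactly(self, count, char_class):
        # Repeat a raw character class (e.g. r"\d") exactly `count` times.
        self._parts.append(f"(?:{char_class}){{{count}}}")
        return self

    def literal(self, text):
        # Escape the text so it is matched verbatim.
        self._parts.append(re.escape(text))
        return self

    def any_of(self, options):
        # Alternatives come from an ordinary Python list of strings.
        escaped = "|".join(re.escape(option) for option in options)
        self._parts.append(f"(?:{escaped})")
        return self

    def compile(self):
        return re.compile("".join(self._parts))


# A currency marker followed by two digits, a dot, and two more digits.
price = (
    PatternBuilder()
    .start_of_input()
    .any_of(["$", "EUR"])
    .exactly(2, r"\d")
    .literal(".")
    .exactly(2, r"\d")
    .end_of_input()
    .compile()
)

print(bool(price.match("$12.50")))
print(bool(price.match("12.50")))
```

Each method returns `self`, which is what allows the calls to be chained into something that reads like a description of the pattern.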
pyFileSec provides robust yet easy-to-use tools for working with files that may contain sensitive information. The aim is to achieve an "industry standard" level of privacy (AES256), capable of protecting confidential information from inspection or accidental disclosure.
A Django boilerplate application for deploying to AWS, using S3 for serving static files.
Python for Android is a project for building your own Python distribution that includes the modules you want, and for creating an APK that bundles Python, libraries, and your application.
A Python micro web framework and asynchronous networking library that supports Python 3.x.
Pilbox is an image resizing application server built on Python's Tornado web framework using the Python Imaging Library (Pillow). It is not intended to be the primary source of images, but instead acts as a proxy which requests images and resizes them as desired.
The windML framework provides easy-to-use access to wind data sources within the Python world, building upon numpy, scipy, sklearn, and matplotlib. As a machine learning module, it provides versatile tools for learning tasks such as time-series prediction, classification, clustering, and dimensionality reduction.
Open content for self-directed learning in data science. It is a collection of data science learning materials in the form of IPython Notebooks and associated data sets.
Whoosh is a fast, featureful full-text indexing and searching library implemented in pure Python. Programmers can use it to easily add search functionality to their applications and websites. Every part of how Whoosh works can be extended or replaced to meet your needs exactly.
A Python script for summarizing articles using NLTK.
A Python library for genetic algorithms.
Kuyruk is a simple and easy way of distributing tasks to run on servers. It uses RabbitMQ as its message broker and depends on Pika, a pure-Python RabbitMQ client library.
Object data mapper and advanced query manager for non-relational databases. Designed to work with the Redis data store, it now has an experimental implementation for MongoDB.
pyHarmonySearch is a pure Python implementation of the harmony search (HS) global optimization algorithm.
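For readers unfamiliar with harmony search, the following is a minimal, generic sketch of the algorithm for minimization using only the standard library. It is not pyHarmonySearch's code or API; the function name, parameter names, and default values are assumptions chosen for illustration.

```python
import random


def harmony_search(objective, bounds, memory_size=20, hmcr=0.9, par=0.3,
                   bandwidth=0.05, iterations=5000, seed=0):
    """Minimize `objective` over box `bounds` with a basic harmony search.

    hmcr: probability of reusing a value from harmony memory.
    par: probability of pitch-adjusting a reused value.
    bandwidth: maximum pitch adjustment, as a fraction of the variable range.
    """
    rng = random.Random(seed)
    # Harmony memory: a pool of random candidate solutions and their scores.
    memory = [[rng.uniform(lo, hi) for lo, hi in bounds]
              for _ in range(memory_size)]
    scores = [objective(harmony) for harmony in memory]

    for _ in range(iterations):
        new = []
        for d, (lo, hi) in enumerate(bounds):
            if rng.random() < hmcr:
                # Memory consideration: reuse this variable from a stored harmony.
                value = rng.choice(memory)[d]
                if rng.random() < par:
                    # Pitch adjustment: nudge the reused value slightly.
                    value += rng.uniform(-1, 1) * bandwidth * (hi - lo)
                value = min(max(value, lo), hi)
            else:
                # Random selection: draw a fresh value from the allowed range.
                value = rng.uniform(lo, hi)
            new.append(value)

        new_score = objective(new)
        worst = max(range(memory_size), key=scores.__getitem__)
        if new_score < scores[worst]:
            # Replace the worst stored harmony when the new one is better.
            memory[worst], scores[worst] = new, new_score

    best = min(range(memory_size), key=scores.__getitem__)
    return memory[best], scores[best]


def sphere(x):
    """Simple test function with its minimum of 0 at the origin."""
    return sum(v * v for v in x)


best_x, best_score = harmony_search(sphere, [(-5.0, 5.0)] * 2)
print(best_x, best_score)
```

The fixed `seed` only makes runs repeatable; the three moves inside the inner loop (memory consideration, pitch adjustment, random selection) are the core of the method.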
IOPro provides a version of pyodbc containing extra methods that load data directly into NumPy ndarrays. With IOPro, queries execute several times faster, and memory usage can be 10x-16x lower than with default pyodbc and pandas methods. This post presents graphs and figures showing improved execution time and memory usage from the use of these new methods.
A Python module for easily creating thread pools with additional advanced functionality.
django-basic-stats is a simple traffic statistics application. It shows the latest referrers, Google query terms, and overall hit counts. It also provides optional logging and statistics for mobile devices (user agent, screen and window width/height, device pixel ratio).
Quokka is a flexible content management platform powered by Python, Flask and MongoDB. Quokka provides a "full-stack" Flask application plus a set of selected extensions that supply the CMS admin features, along with a flexible, easy way to extend the platform with quokka-modules built following the Flask Blueprints pattern.
Hachoir is a Python library that lets you view and edit a binary stream field by field. In other words, Hachoir allows you to "browse" any binary stream just like you browse directories and files. A file is split into a tree of fields, where the smallest field is just one bit.
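The "tree of fields" idea can be sketched with the standard `struct` module. This is only an illustration of field-by-field parsing, not Hachoir's API: the helper names are hypothetical, the field paths are plain strings, and the smallest unit here is a byte rather than a bit. The example parses the 8-byte PNG signature and the IHDR chunk that follows it.

```python
import struct
import zlib


def build_png_header(width, height):
    """Build the PNG signature plus an IHDR chunk to use as sample input."""
    ihdr = struct.pack(">IIBBBBB", width, height, 8, 2, 0, 0, 0)
    crc = zlib.crc32(b"IHDR" + ihdr) & 0xFFFFFFFF
    return (b"\x89PNG\r\n\x1a\n" + struct.pack(">I", len(ihdr))
            + b"IHDR" + ihdr + struct.pack(">I", crc))


def parse_fields(data):
    """Return (path, offset, size, value) tuples for each field in order."""
    fields = []
    offset = 0

    def take(path, fmt):
        # Decode one field at the current offset and advance past it.
        nonlocal offset
        size = struct.calcsize(fmt)
        (value,) = struct.unpack_from(fmt, data, offset)
        fields.append((path, offset, size, value))
        offset += size
        return value

    take("signature", ">8s")
    take("ihdr/length", ">I")
    take("ihdr/type", ">4s")
    for name, fmt in [("width", ">I"), ("height", ">I"), ("bit_depth", ">B"),
                      ("color_type", ">B"), ("compression", ">B"),
                      ("filter", ">B"), ("interlace", ">B")]:
        take("ihdr/data/" + name, fmt)
    take("ihdr/crc", ">I")
    return fields


fields = parse_fields(build_png_header(640, 480))
for path, offset, size, value in fields:
    print(f"{offset:3d} {size:2d} {path} = {value!r}")
```

The slash-separated paths play the role of directories and files: `ihdr/data/width` is a leaf inside the `ihdr/data` group, and every leaf records where in the stream it was read from.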
Scalpel is a text analysis tool that implements and integrates various text analysis and processing algorithms and packages. Its approach and design aim to make the library as usable, clear, and understandable as possible for researchers and developers. One of its main goals is the seamless integration of various third-party text processing libraries (TNT, Lingpipe, Stanford, etc.) under a common roof with a unified interface.
Painlessly create beautiful matplotlib plots
twosheds
twosheds is a library, written in Python, for making command language interpreters, or shells.
Shells like bash are very powerful, but they require you to learn C or clunky domain-specific scripting languages to extend and customize. twosheds lets you write your own shell, in Python, which means you can customize it completely: