This page indexes add-on software and other resources relevant to SciPy, categorized by scientific discipline or computational topic. It is intended to be exhaustive. If you know of an unlisted resource, see About This Page, below.
You can also check our code sharing / software list site: http://scipy-central.org/
- About This Page
- General Python resources
- Tutorials and texts
- Working environments
- Science: basic tools
- Running Code Written In Other Languages
- Plotting, data visualization, 3-D programming
- Systems of nonlinear equations
- Automatic differentiation
- Finite differences derivatives approximation
- Data Storage / Database
- Parallel and distributed programming
- Partial differential equation (PDE) solvers
- Topic guides, organized by scientific field
About This Page
As the links in each section become numerous, they may be placed on separate pages of their own, so be sure to visit those pages as well, if they're relevant to your search. The listings are roughly organized by topic, with introductory resources first, more general topics next, and discipline-specific resources last.
Unless otherwise indicated, all packages listed here are provided under some form of open source license.
If you distribute or know of a resource that is not listed here, please add a listing. Create an account, log in, and visit the wiki; you will then see an "edit" tab at the top of the page. Click it to start editing. If you need help, there is a help link at the top of the page. Please follow the general format of the existing listings. Be as brief as you can, but make sure you include in your text a link to the resource's home page and some keywords that potential users might search for to find the resource. Finally, when you have saved, check your listing and test the link to make sure it works.
In addition, please also list your software on http://scipy-central.org/
If you wish to restructure the page (e.g., break out a section into its own page, change the section headings, etc.), please propose the idea to the firstname.lastname@example.org mailing list first and get community feedback.
General Python resources
Python.org: official website for the Python language. It includes links to the current documentation and tutorials, downloads for many platforms, the Python mailing lists and newsgroups, and much more.
Python Package Index (PyPI): the official Python.org package index (the Python standard distribution system, distutils, includes support for automatically registering packages with PyPI).
PyCode: Another collection of python packages and resources.
The Python Cookbook: a community-driven collection of code snippets for many tasks.
The O'Reilly Python DevCenter: O'Reilly is widely regarded as one of the best computing book publishers, and they maintain a resource center devoted to Python. This includes both their publications and articles on Python-related topics.
The Python Learning Foundation: a large collection of Python links.
Tutorials and texts
Some generic Python/programming tutorials:
The standard Python docs: these contain the official documentation and tutorials that ship with the language.
Learning to Program: beginner's tutorial.
How to Think Like a Computer Scientist: another beginner's tutorial.
Many more: an external collection with over 100 tutorials.
And some more specifically geared towards scientific computing:
The main Numpy and Scipy documentation.
The user guide for the new NumPy system. Numeric and Numarray have been unified into the new NumPy environment. This document covers all the details of the new system, and was written by its lead developer.
A tutorial focused on interactive data analysis for astronomy, but of generic utility to most scientific users. Developed at the STSCI, available for free download including all data files necessary to run the examples.
Konrad Hinsen's Python page: contains a number of introductions and tutorials to Python, geared towards the needs of scientists.
Python Scripting for Computational Science: not free, this is a Springer book.
Python/Matlab/Octave/Scilab/R/Gnuplot/IDL/Axiom cross-reference by Vidar Gundersen.
Software Carpentry is an open source course on basic software development skills for people with backgrounds in science, engineering, and medicine.
Working environments
IPython: an interactive environment with many features geared towards efficient work in typical scientific usage. It borrows many ideas from the interactive shells of Mathematica, IDL, Matlab and similar packages. It includes special support for the matplotlib and gnuplot plotting packages. IPython also has support for (X)Emacs, to be used as a full IDE with IPython as the interactive Python shell.
Spyder (formerly called Pydee): A Qt based IDE suited to developing scientific applications. Includes integrated and external python consoles, code checking built into the editor, a graphical class browser and full support for matplotlib graphs.
IEP: a cross-platform Python IDE focused on interactivity and introspection, which makes it very suitable for scientific computing. Its practical design is aimed at simplicity and efficiency. IEP consists of two main components, the editor and the shell, and uses a set of pluggable tools to help the programmer in various ways. Some example tools are source structure, project manager, interactive help, workspace ...
Pymacs: a tool which, once started from Emacs, allows both-way communication between Emacs Lisp and Python.
Other IDE links: the official Python website maintains a comprehensive list of Integrated Development Environments for Python.
Links centered on the Ubuntu distribution.
Science: basic tools
These are links which cover basic tools generally useful for scientific work in almost any area. Many of the more specific packages listed later depend on one or more of these.
SciPy: umbrella project which includes a variety of high level science and engineering modules together as a single package. SciPy includes modules for linear algebra (including wrappers to BLAS and LAPACK), optimization, integration, special functions, FFTs, signal and image processing, ODE solvers, and others.
NumPy is the package SciPy builds on and requires as a prerequisite. It is a hybrid of Numeric and Numarray, incorporating features of both. If you are new to numeric computing with Python, you should use NumPy.
Numerical Python and Numarray: these packages are the predecessors of NumPy. Numerical Python is now deprecated. According to Perry Greenfield at STScI, which funded Numarray development, Numarray will be supported until about the end of 2007 (support provided mainly by Todd Miller) while all code based on Numarray is ported to use NumPy.
ScientificPython (http://dirac.cnrs-orleans.fr/ScientificPython/): another collection of Python modules for scientific computing. It includes basic geometry (vectors, tensors, transformations, vector and tensor fields), quaternions, automatic derivatives, (linear) interpolation, polynomials, elementary statistics, nonlinear least-squares fits, unit calculations, Fortran-compatible text formatting, 3D visualization via VRML, and two Tk widgets for simple line plots and 3D wireframe models. There are also interfaces to the netCDF library (portable structured binary files), to MPI (Message Passing Interface, message-based parallel programming), and to BSPlib (Bulk Synchronous Parallel programming). Much of this functionality has been incorporated into SciPy, but not all.
Numexpr: a package that accepts numpy array expressions as strings, rewrites them to optimize execution time and memory use, and executes them much faster than numpy usually can.
GMPY: a python interface for the GNU Multiple Precision library (gmp).
RPy: a very simple, yet robust, Python interface to the R Programming Language. It can manage all kinds of R objects and can execute arbitrary R functions (including the graphic functions). All errors from the R language are converted to Python exceptions. Any module installed for the R system can be used from within Python.
Enthought Python Distribution: for Windows, OSX, and RedHat users. This is a very useful download, in a single package, of a number of different tools for scientific computing (including many listed in this page). This saves users the hassles of manually building all of these packages, some of which can be fairly difficult to get to work. EPD is free for academic and non-profit use, but fee-based for commercial and governmental use.
Python(x,y): A complete distribution for Windows or Ubuntu users containing all the packages needed for full Python distribution for scientific development, including Qt based GUI design. Also includes Spyder (formerly called Pydee), a Python IDE suited to scientific development.
PyROOT, a run-time based python binding to the ROOT framework: ROOT is a complete system for development of scientific applications, from math and graphics libraries, to efficient storage and reading of huge data sets, to distributed analysis. The python bindings are based on run-time type information, such that you can add your own C++ classes on the fly to the system with a one-liner and down-casting as well as pointer manipulations become unnecessary. Using RTTI keeps memory and call overhead down to a minimum, resulting in bindings that are more light-weight and faster than any of the "standard" bindings generators.
PyDX, automatic differentiation, arbitrary precision arithmetic, interval arithmetic, interval ODE solver, differential geometry constructs.
NetworkX, Python package for the creation, manipulation, and study of the structure, dynamics, and function of complex networks.
PyAMG, a library of Algebraic Multigrid (AMG) solvers for large scale linear algebra problems.
PyTrilinos Python interface to Trilinos, a framework for solving large-scale, complex multi-physics engineering and scientific problems.
PyIMSLStudio is a complete packaged, supported and documented development environment for Windows and Red Hat designed for prototyping mathematics and statistics models and deploying them into production applications. PyIMSL Studio includes wrappers for the IMSL Numerical Library, a Python distribution and a selection of open source python modules useful for prototype analytical development. PyIMSL Studio is available for download at no charge for non-commercial use or for commercial evaluation.
Running Code Written In Other Languages
Wrapping C, C++, and FORTRAN Codes
SWIG: a software development tool that connects programs written in C and C++ with a variety of high-level programming languages. SWIG is primarily used with common scripting languages such as Perl, Python, Tcl/Tk and Ruby. The SWIG Typemaps page describes SWIG modifications for use with Numeric arrays.
Boost.Python: a C++ library which enables seamless interoperability between C++ and Python. The PythonInfo Wiki contains a good howto reference. The c++-sig (http://www.python.org/sigs/c++-sig/) at python.org is devoted to Boost, and you can subscribe to its mailing list. Some personal notes can be found at Boost.Notes.
F2PY: provides a connection between the Python and Fortran languages. F2PY is a Python extension tool for creating Python C/API modules from (handwritten or F2PY generated) signature files (or directly from Fortran sources).
Weave: allows the inclusion of C/C++ within Python code. It has facilities for automatic creation of C/C++ based Python extension modules, as well as for direct inlining of C/C++ code in Python sources. The latter combines the scripting flexibility of Python with the execution speed of compiled C/C++, while handling automatically all module generation details.
PyCxx: CXX/Objects is a set of C++ facilities to make it easier to write Python extensions. The chief way in which PyCXX makes it easier to write Python extensions is that it greatly increases the probability that your program will not make a reference-counting error and will not have to continually check error returns from the Python C API.
ctypes: a package to create and manipulate C data types in Python, and to call functions in dynamic link libraries/shared dlls. It allows wrapping these libraries in pure Python.
railgun: ctypes utilities for faster and easier simulation programming in C and Python
Instant: a Python module that allows for instant inlining of C and C++ code in Python. It is a small Python module built on top of SWIG.
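As an illustration of the ctypes entry above, here is a minimal sketch of wrapping a shared C library in pure Python. It assumes a Unix-like system where the C math library can be located (or is already linked into the interpreter); the wrapper name `c_cos` is hypothetical and chosen only for this example.

```python
import ctypes
import ctypes.util

# Locate the C math library. On POSIX systems, passing None to CDLL
# falls back to the symbols already loaded into the running process.
libm_path = ctypes.util.find_library("m")
libm = ctypes.CDLL(libm_path)

# Declare the C signature: double cos(double).
# Without this, ctypes would assume int arguments and an int return value.
libm.cos.argtypes = [ctypes.c_double]
libm.cos.restype = ctypes.c_double


def c_cos(x):
    """Call the C library's cos() on a Python float (hypothetical wrapper)."""
    return libm.cos(x)


print(c_cos(0.0))
```

Declaring `argtypes` and `restype` is the important step: it tells ctypes how to convert Python values to and from C types for that function.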
Converting Code From Other Array Languages
IDL: The Interactive Data Language from ITT
Plotting, data visualization, 3-D programming
Tools with a (mostly) 2-D focus
matplotlib: a Python 2-D plotting library which produces publication quality figures in a variety of hardcopy formats (PNG, JPG, PS, SVG) and interactive GUI environments (WX, GTK, Tkinter, FLTK, Qt) across platforms. matplotlib can be used in python scripts, interactively from the python shell (ala matlab or mathematica), in web application servers generating dynamic charts, or embedded in GUI applications. For interactive use, IPython provides a special mode which integrates with matplotlib. See the matplotlib cookbook for recipes.
Chaco: Chaco is a Python toolkit for producing interactive plotting applications. Chaco applications can range from simple line plotting scripts up to GUI applications for interactively exploring different aspects of interrelated data. As an open-source project being developed by Enthought, Chaco leverages other Enthought technologies such as Kiva, Enable, and Traits to produce highly interactive plots of publication quality.
PyQwt: a set of Python bindings for the Qwt C++ class library which extends the Qt framework with widgets for scientific and engineering applications. It provides a widget to plot 2-dimensional data and various widgets to display and control bounded or unbounded floating point values.
xplt: a fast 2-D plotting library in the scipy sandbox that can be useful for some applications. It is an extension of pygist.
HippoDraw: a highly interactive data analysis environment. It is written in C++ with the Qt library from Trolltech. It includes Python bindings, and has a number of features for the kinds of data analysis typical of High Energy physics environments, as it includes native support for ROOT NTuples. It is well optimized for real-time data collection and display.
Biggles: a module for creating publication-quality 2D scientific plots. It supports multiple output formats (postscript, x11, png, svg, gif), understands simple TeX, and sports a high-level, elegant interface.
Gnuplot.py: a Python package that interfaces to gnuplot, the popular open-source plotting program. It allows you to use gnuplot from within Python to plot arrays of data from memory, data files, or mathematical functions. If you use Python to perform computations or as `glue' for numerical programs, you can use this package to plot data on the fly as they are computed. IPython includes additional enhancements to Gnuplot.py (but which require the base package) to make it more efficient in interactive usage.
disipyl: an object-oriented wrapper around the DISLIN plotting library, written in the computer language Python. disipyl provides a set of classes which represent various aspects of DISLIN plots, as well as providing some easy to use classes for creating commonly used plot formats (e.g. scatter plots, histograms, 3-D surface plots). A major goal in designing the library was to facilitate interactive data exploration and plot creation.
OpenCv: a mature library for Image Processing, Structural Analysis, Motion Analysis and Object Tracking, and Pattern Recognition that has recently added SWIG-based Python bindings. Windows and Linux-RPM packages are available. An open source project originally sponsored by Intel, it can be coupled with the Intel Performance Primitive package (IPP) for increased performance. A wiki is also available.
PyChart: a library for creating Encapsulated Postscript, PDF, PNG, or SVG charts. It currently supports line plots, bar plots, range-fill plots, and pie charts.
pygame: though intended for writing games using Python, its general-purpose multimedia libraries definitely have other applications in visualization.
PyNGL: a Python module for creating publication-quality 2D visualizations, with emphasis in the geosciences. PyNGL can create contours, vectors, streamlines, XY plots, and overlay any one of these on several map projections. PyNGL's graphics are based on the same high-quality graphics as the NCAR Command Language and NCAR Graphics.
Data visualization (mostly 3-D, surfaces and volumetric rendering)
MayaVi: a free, easy to use scientific data visualizer. It is written in Python and uses the amazing Visualization Toolkit (VTK) for the graphics. It provides a GUI written using Tkinter. MayaVi supports visualizations of scalar, vector and tensor data in a variety of ways, including meshes, surfaces and volumetric rendering. MayaVi can be used both as a standalone GUI program and as a Python library to be driven by other Python programs.
Mayavi2 is the successor of MayaVi. It is vastly superior to MayaVi1, has a Pythonic API, supports numpy arrays transparently, provides a powerful application, reusable library and a powerful pylab like equivalent called mlab for rapid 3D plotting.
visvis: a pure Python library for visualization of 1D to 4D data in an object oriented way. Essentially, visvis is an object oriented layer of Python on top of OpenGl, thereby combining the power of OpenGl with the usability of Python. A Matlab-like interface in the form of a set of functions allows easy creation of objects (e.g. plot(), imshow(), volshow(), surf()).
Py-OpenDX : Py-OpenDX is a Python binding for the OpenDX API. Currently only the DXLink library is wrapped, though this may be expanded in the future to cover other DX libraries such as CallModule and DXLite.
IVuPy: (I-View-Py) serves to develop Python programs for 3D visualization of huge data sets using Qt and PyQt. IVuPy interfaces more than 600 classes of two of the Coin3D C++ libraries to Python, integrates very well with PyQt, and is fun to program. Coin3D is a scene graph library, and is optimized for speed. In comparison with VTK, Coin3D is more low level and lacks many of VTK's advanced visualization and imaging algorithms.
Pivy is another Coin3D binding for Python. Pivy allows the development of Coin3D applications and extensions in Python, interactive modification of Coin3D programs from within the Python interpreter at runtime and incorporation of Scripting Nodes into the scene graph which are capable of executing Python code and callbacks.
Mat3D provides a few routines for basic 3D plotting. It makes use of OpenGL and is written in Python and Tk. One can interact (rotate and zoom) with the generated graph and the view can be saved to an image.
S2PLOT is a three-dimensional plotting library based on OpenGL with support for standard and enhanced display devices. The S2PLOT library was written in C and can be used with C, C++, FORTRAN and Python programs on GNU/Linux, Apple/OSX and GNU/Cygwin systems. The library is currently closed-source, but free for commercial and academic use. They are hoping for an open source release towards the end of 2008.
LaTeX, PostScript, diagram generation
PyX: a package for the creation of encapsulated PostScript figures. It provides both an abstraction of PostScript and a TeX/LaTeX interface. Complex tasks like 2-D and 3-D plots in publication-ready quality are built out of these primitives.
Dot2TeX: Another tool in the Dot/Graphviz/LaTeX family, this is a Graphviz to LaTeX converter. The purpose of dot2tex is to give graphs generated by Graphviz a more LaTeX friendly look and feel. This is accomplished by converting xdot output from Graphviz to a series of PSTricks or PGF/TikZ commands.
pyreport: runs a script and captures the output (pylab graphics included). Generates a LaTeX or PDF report out of it, including literal comments and pretty-printed code.
Other 3-D programming tools
VPython: a Python module that offers real-time 3D output, and is easily usable by novice programmers.
OpenRM Scene Graph: a developers toolkit that implements a scene graph API, and which uses OpenGL for hardware accelerated rendering. OpenRM is intended to be used to construct high performance, portable graphics and scientific visualization applications on Unix/Linux/Windows platforms.
Panda3D: an open source game and simulation engine.
Python Computer Graphics Kit: a collection of Python modules that contain the basic types and functions required for creating 3D computer graphics images.
PyGeo: a dynamic 3-D geometry laboratory. PyGeo may be used to explore the most basic concepts of Euclidean geometry at an introductory level, including by elementary school students and their teachers, but it is particularly suitable for exploring more advanced geometric topics such as projective geometry and the geometry of complex numbers.
Python 3-D software collection: A small collection of pointers to Python software for working in three dimensions.
pyFormex: a program for generating, transforming and manipulating large geometrical models of 3D structures by sequences of mathematical operations.
SpaceFuncs: a tool for 2D, 3D, N-dimensional geometric modeling with possibilities of parametrized calculations, numerical optimization and solving systems of geometrical equations with automatic differentiation.
Optimization
APLEpy: A Python modeling tool for linear and mixed-integer linear programs.
Coopr: Coopr is a collection of Python optimization-related packages that supports a diverse set of optimization capabilities for formulating and analyzing optimization models.
CVExp: Expression Tree Builder and Translator based on a Controlled Vocabulary
CVXOPT (license: GPL3), a tool for convex optimization which defines its own matrix-like object and interfaces to FFTW, BLAS, and LAPACK.
DEAP: Distributed Evolutionary Algorithms in Python
ECsPy: Evolutionary Computations in Python
EMMA: A Python optimization library with a focus on constraint programming
Mystic: An optimization framework focused on continuous optimization.
NLPy: A Python optimization framework that leverages AMPL to create problem instances, which can then be processed in Python
OpenOpt (license: BSD): a numerical optimization framework with some solvers of its own and connections to many others. It allows connection of software under any license, while scipy.optimize allows only copyleft-free software (e.g., BSD, MIT). Other features include a convenient standard interface for all solvers, graphical output, categorical variables, disjunctive and other logical constraints, automatic checking of first derivatives, a multifactor analysis tool for experiment planning, and much more. You can optimize FuncDesigner models with automatic differentiation. The OpenOpt website also hosts a numerical optimization forum. OpenOpt has a commercial add-on (free for small-scale research/educational problems) for stochastic programming.
PuLP: A Python package that can be used to describe linear programming and mixed-integer linear programming optimization problems
PyEvolve Genetic Algorithms in Python
PyLinpro: A pure simplex tableau solver for linear programming
Pyiopt: A Python interface to the COIN-OR Ipopt solver
python-zibopt: A Python interface to SCIP
scikits.optimization is a generic optimization framework entirely written in Python
lmfit-py is a wrapper around scipy.optimize.leastsq that uses named fitting parameters which may be varied, fixed, or constrained with simple mathematical expressions.
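For orientation, the underlying scipy.optimize.leastsq routine mentioned in the lmfit-py entry can be used directly as sketched below. This is plain SciPy, not the lmfit-py API; the straight-line model, the synthetic data, and the names `residuals`, `a_fit` and `b_fit` are assumptions made only for illustration.

```python
import numpy as np
from scipy.optimize import leastsq

# Synthetic, noise-free data from y = 2.0 * x + 1.0 (illustrative values).
x = np.linspace(0.0, 10.0, 20)
y = 2.0 * x + 1.0


def residuals(params, x, y):
    """Difference between the data and the model a * x + b."""
    a, b = params
    return y - (a * x + b)


# leastsq minimizes the sum of squared residuals, starting from the
# initial guess [1.0, 0.0]. Parameters are positional, not named.
best_fit, ier = leastsq(residuals, [1.0, 0.0], args=(x, y))
a_fit, b_fit = best_fit
print(a_fit, b_fit, ier)
```

Note that leastsq identifies parameters only by their position in the array, which is the limitation that named-parameter wrappers aim to address.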
Systems of nonlinear equations
fsolve from scipy.optimize
SNLE from OpenOpt - can perform automatic differentiation; also, one of its solvers interalg, based on interval analysis, is capable of yielding all solutions inside any user-defined region lb_i <= x_i <= ub_i
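A minimal sketch of fsolve from scipy.optimize, listed above, applied to a small system. The two equations and the starting point are made up for illustration; fsolve returns only the one root it converges to from the given starting point.

```python
import numpy as np
from scipy.optimize import fsolve


def equations(v):
    """Residuals of the system x**2 + y**2 = 4 and x = y."""
    x, y = v
    return [x**2 + y**2 - 4.0, x - y]


# Start the iteration at (1, 1); the nearby root is x = y = sqrt(2).
root = fsolve(equations, [1.0, 1.0])
print(root)
```

Starting from a different point, such as (-1, -1), would be expected to converge to the other root at x = y = -sqrt(2).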
Automatic differentiation
FuncDesigner - can also solve ODEs, use OpenOpt for numerical optimization, and perform uncertainty and interval analysis
pycppad - wrapper for CppAD, second order forward/reverse
pyadolc - wrapper for ADOL-C, arbitrary order forward/reverse
Finite differences derivatives approximation
check_grad from scipy.optimize
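A short sketch of how check_grad from scipy.optimize can be used to compare a hand-written gradient against a finite-difference approximation. The function `f`, its gradient `grad_f`, and the test point are illustrative assumptions.

```python
import numpy as np
from scipy.optimize import check_grad


def f(x):
    """Scalar test function f(x0, x1) = x0**2 + 3 * x0 * x1."""
    return x[0] ** 2 + 3.0 * x[0] * x[1]


def grad_f(x):
    """Analytic gradient of f, written by hand."""
    return np.array([2.0 * x[0] + 3.0 * x[1], 3.0 * x[0]])


x0 = np.array([1.5, -0.5])

# check_grad returns the 2-norm of the difference between grad_f(x0)
# and a finite-difference approximation of the gradient of f at x0.
err = check_grad(f, grad_f, x0)
print(err)
```

A value close to zero suggests the analytic gradient is consistent with the function; a large value usually points to a mistake in the gradient code.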
Data Storage / Database
pyhdf: pyhdf is a python interface to the HDF4 library. Among the numerous components offered by HDF4, the following are currently supported by pyhdf: SD (Scientific Dataset), VS (Vdata), V (Vgroup) and HDF (common declarations).
Parallel and distributed programming
For a brief discussion of parallel programming within numpy/scipy, see ParallelProgramming.
PyMPI: Distributed Parallel Programming for Python! This package builds on traditional Python by enabling users to write distributed, parallel programs based on MPI message passing primitives. General python objects can be messaged between processors.
Pypar: Parallel Programming in the spirit of Python! Pypar is an efficient but easy-to-use module that allows programs/scripts written in the Python programming language to run in parallel on multiple processors and communicate using message passing. Pypar provides bindings to an important subset of the message passing interface standard MPI.
jug is a task-based parallel framework. It is especially useful for embarrassingly parallel problems such as parameter sweeps. It can take advantage of a multi-core machine or a set of machines on a computing cluster.
MPI for Python: Object Oriented Python bindings for the Message Passing Interface. This module provides MPI support to run Python scripts in parallel. It is constructed on top of the MPI-1 specification, but provides an object oriented interface which closely follows the standard MPI-2 C++ bindings. Any picklable Python object can be communicated. There is support for point-to-point (sends, receives) and collective (broadcasts, scatters, gathers) communications as well as group and communicator (inter, intra and topologies) management.
PyPVM: A Python interface to Parallel Virtual Machine (PVM), a portable heterogeneous message-passing system. It provides tools for interprocess communication, process spawning, and execution on multiple architectures.
Module Scientific.BSP in Konrad Hinsen's ScientificPython provides an experimental interface to the Bulk Synchronous Parallel (BSP) model of parallel programming (note the link to the BSP tutorial on the ScientificPython page). Module Scientific.MPI provides an MPI interface. The BSP model is an alternative to MPI and PVM message passing model. It is said to be easier to use than the message passing model, and is guaranteed to be deadlock-free.
Pyro: PYthon Remote Objects (Pyro) provides an object-oriented form of RPC. It is a Distributed Object Technology system written entirely in Python, designed to be very easy to use. Never worry about writing network communication code again: when using Pyro, you just write your Python objects as you normally would. With only a few lines of extra code, Pyro takes care of the network communication between your objects once you split them over different machines on the network. All the gory socket programming details are taken care of; you just call a method on a remote object as if it were a local object!
PyXG: Object oriented Python interface to Apple's Xgrid. PyXG makes it possible to submit and manage Xgrid jobs and tasks from within interactive Python sessions or standalone scripts. It provides an extremely lightweight method for performing independent parallel tasks on a cluster of Macintosh computers.
Pyslice: a specialized templating system that replaces variables in a template data set with numbers taken from all combinations of variables. It creates a dataset from input template files for each combination of variables in the series and can optionally run a simulation or submit a simulation run to a queue against each created data set. For example: create all possible combinations of datasets that represent the 'flow' variable with numbers from 10 to 20 by 2 and the 'level' variable with 24 values taken from a normal distribution with a mean of 104 and standard deviation of 5.
Python::OpenCL: OpenCL is a standard for parallel programming on heterogeneous devices including CPUs, GPUs, and other processors. It provides a common C-like language for executing code on those devices, as well as APIs to set up the computations. Python::OpenCL aims at being an easy-to-use Python wrapper around the OpenCL library.
PyCSP: Communicating Sequential Processes for Python. PyCSP may be used to structure scientific software into concurrent tasks. Dependencies are handled through explicit communication, which allows for better understanding of the structure. A PyCSP application can be executed using co-routines, threads or processes.
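The kind of parameter sweep described in the Pyslice entry above can be sketched with the standard library alone. This is not the Pyslice API; the template text, the variable names, and the random seed are assumptions made for illustration.

```python
import itertools
import random
from string import Template

random.seed(0)  # fixed seed so the sketch is repeatable

# 'flow' takes the values 10, 12, ..., 20; 'level' takes 24 draws from
# a normal distribution with mean 104 and standard deviation 5.
flow_values = list(range(10, 21, 2))
level_values = [random.gauss(104, 5) for _ in range(24)]

# One dataset per combination of the two variables.
datasets = [
    {"flow": flow, "level": level}
    for flow, level in itertools.product(flow_values, level_values)
]

# Hypothetical input template; each dataset fills in its own values.
template = Template("flow = $flow\nlevel = $level\n")
rendered = [template.substitute(d) for d in datasets]

print(len(rendered))
```

Each rendered string could then be written to its own input file and handed to a simulation or a job queue.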
Partial differential equation (PDE) solvers
Topic guides, organized by scientific field
Astronomy
AstroPy: Central repository of information about Python and Astronomy.
AstroPython: Knowledge base for research in astronomy using Python.
PyRAF: a new command language for running IRAF tasks that is based on the Python scripting language.
BOTEC: a simple astrophysical and orbital mechanics calculator, including a database of all named Solar System objects.
AstroLib: an open source effort to develop general astronomical utilities akin to those available in the IDL ASTRON package
APLpy: a Python module aimed at producing publication-quality plots of astronomical imaging data in FITS format.
Tutorial: Using Python for interactive data analysis in astronomy.
ParselTongue: A Python interface to classic AIPS for the calibration, data analysis, image display etc. of (primarily) Radio Astronomy data.
Casa a suite of C++ application libraries for the reduction and analysis of radioastronomical data (derived from the former AIPS++ package) with a Python scripting interface.
Healpy Python package for using and plotting HEALpix data (e.g. spherical surface maps such as WMAP data).
Pysolar: a collection of Python libraries for simulating the irradiation of any point on earth by the sun. Pysolar includes code for extremely precise ephemeris calculations, and more. It could also be grouped under engineering tools.
pywcsgrid2: displays astronomical FITS images with matplotlib.
pyregion: a Python module to parse ds9 region files (also supports CIAO region files).
SpacePy provides tools for the exploration and analysis of data in the space sciences. Features include a Pythonic interface to NASA CDF, time and coordinate conversions, a datamodel for manipulation of data and metadata, empirical models widely used in space science, and tools for everything from statistical analysis to multithreading.
Artificial intelligence & machine learning
See also the Bayesian Statistics section below
scikit learn General purpose efficient machine learning and data mining library in Python, for scipy.
ffnet Feed-forward neural network for python, uses numpy arrays and scipy optimizers.
pyem is a tool for Gaussian Mixture Models. It implements the EM algorithm for Gaussian mixtures (including full matrix covariances) and the BIC criterion for clustering. Since October 2006, it has been included in the scipy toolbox.
PyBrain Machine learning library with focus on reinforcement learning, (recurrent) neural networks and black-box optimization.
Orange component-based data mining software.
pymorph Morphology Toolbox: a powerful collection of state-of-the-art gray-scale morphological tools for Python that can be applied to image segmentation, non-linear filtering, pattern recognition and image analysis. Pymorph was originally written by Roberto A. Lotufo and Rubens C. Machado but is now maintained by Luis Pedro Coelho.
pycplex A Python interface to the ILOG CPLEX Callable Library.
ELEFANT We aim at developing an open source machine learning platform which will become the platform of choice for prototyping and deploying machine learning algorithms.
Bayes Blocks The library is a C++/Python implementation of the variational building block framework using variational Bayesian learning.
Monte python A machine learning library written in pure Python. The focus is on gradient based learning. Monte includes neural networks, conditional random fields, logistic regression and more.
hcluster: A hierarchical clustering library for Scipy with base implementation written in C for efficiency. Clusters data, computes cluster statistics, and plots dendrograms.
PyPR A collection of machine learning methods written in Python: Artificial Neural Networks, Gaussian Processes, Gaussian mixture models, and K-means.
Theano: A CPU and GPU Math Expression Compiler: Theano is a Python library that allows you to define, optimize, and evaluate mathematical expressions involving multi-dimensional arrays efficiently.
NeuroLab: Neurolab is a simple and powerful Neural Network Library for Python.
PyMC: PyMC is a Python module that provides a Markov chain Monte Carlo (MCMC) toolkit, making Bayesian simulation models relatively easy to implement. PyMC relieves users of the need for re-implementing MCMC algorithms and associated utilities, such as plotting and statistical summary. This allows the modelers to concentrate on important aspects of the problem at hand, rather than the mundane details of Bayesian statistical simulation.
PyBayes: PyBayes is an object-oriented Python library for recursive Bayesian estimation (Bayesian filtering) that is convenient to use. Already implemented are the Kalman filter, particle filter and marginalized particle filter, all built atop a light framework of probability density functions. PyBayes can optionally use Cython for large speed gains (the Cython build is several times faster).
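To illustrate what the recursive Bayesian estimation mentioned in the PyBayes entry involves, here is a minimal scalar Kalman filter in plain Python. It is only a sketch under simple assumptions (a one-dimensional random-walk state and Gaussian noise) and does not use the PyBayes API. The function name `kalman_1d` and its parameters are hypothetical.

```python
def kalman_1d(measurements, meas_var, process_var, x0=0.0, p0=1.0):
    """Scalar Kalman filter for a random-walk state.

    Illustrative only: the name, parameters and defaults are assumptions
    made for this sketch, not part of any listed library.
    """
    x, p = x0, p0  # current state estimate and its variance
    estimates = []
    for z in measurements:
        # Predict: a random-walk state keeps its mean; only uncertainty grows.
        p = p + process_var
        # Update: blend prediction and measurement using the Kalman gain.
        k = p / (p + meas_var)
        x = x + k * (z - x)
        p = (1.0 - k) * p
        estimates.append(x)
    return estimates
```

Each estimate depends only on the previous estimate and the newest measurement, which is what makes the filter recursive.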
Biology (including Neuroscience)
Brian: a simulator for spiking neural networks in Python.
BioPython: an international association of developers of freely available Python tools for computational molecular biology.
PyCogent: a software library for genomic biology.
Python For Structural BioInformatics Tutorial: This tutorial will demonstrate the utility of the interpreted programming language Python for the rapid development of component-based applications for structural bioinformatics. We will introduce the language itself, along with some of its most important extension modules. Bio-informatics specific extensions will also be described and we will demonstrate how these components have been assembled to create custom applications.
PySAT: Python Sequence Analysis Tools (Version 1.0) PySAT is a collection of bioinformatics tools written entirely in Python. A paper describing these tools.
Python Protein Annotators' Assistant In this project, a software tool has been developed which, given a list of protein identifiers, e.g. as returned by a BLAST or FASTA search, clusters the identifiers around keywords and phrases that might indicate the functions performed by the protein that was used in the original search query.
Python/Tk Viewer for the NCBI Taxonomy Database A viewer for the NCBI taxonomy database, written in Python/Tk, was developed in 1998.
PySCeS: the Python Simulator for Cellular Systems: PySCeS includes tools for the simulation and analysis of cellular systems (GPL).
PyDSTool: PyDSTool is an integrated simulation, modeling and analysis package for dynamical systems used in scientific computing, and includes special toolboxes for computational neuroscience, biomechanics, and systems biology applications.
NIPY: The neuroimaging in python project is an environment for the analysis of structural and functional neuroimaging data. It currently has a full system for general linear modeling of functional magnetic resonance imaging (FMRI).
ACQ4: Data acquisition and analysis system for electrophysiology, photostimulation, and fluorescence imaging.
Vision Egg: produce stimuli for vision research experiments
PsychoPy: create psychology stimuli in Python
pyQPCR: a GUI application for analysing raw quantitative PCR (QPCR) data. Using quantification cycle values extracted from QPCR instruments, it applies a proven and universally applicable model (the delta-delta Ct method) to give finalized quantification results.
VeSPA: The VeSPA suite contains three magnetic resonance (MR) spectroscopy applications: RFPulse (for RF pulse design), Simulation (for spectral simulation), and Analysis (for spectral data processing and analysis).
Neo: A package for representing electrophysiology data in Python, together with support for reading a wide range of neurophysiology file formats.
Myokit: A programming toolkit for working with ODE models of cardiac myocytes (and other excitable tissues).
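The delta-delta Ct method named in the pyQPCR entry above comes down to a short calculation. The sketch below shows the textbook form, assuming roughly 100% amplification efficiency (a doubling per cycle). It is not pyQPCR's code, and the function and argument names are hypothetical.

```python
def fold_change(ct_target_sample, ct_ref_sample,
                ct_target_control, ct_ref_control):
    """Relative expression of a target gene by the delta-delta Ct method.

    Assumption: amplification efficiency is about 100%, so each cycle
    doubles the product. Names are illustrative only.
    """
    # Normalize the target gene to the reference gene in each condition.
    delta_sample = ct_target_sample - ct_ref_sample
    delta_control = ct_target_control - ct_ref_control
    # Compare the treated sample against the control.
    delta_delta = delta_sample - delta_control
    return 2.0 ** (-delta_delta)
```

For example, a target that crosses the threshold two cycles earlier (after normalization) in the sample than in the control gives a four-fold increase.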
PyDSTool: PyDSTool is an integrated simulation, modeling and analysis package for dynamical systems (ODEs, DDEs, DAEs, maps, time-series, hybrid systems). Continuation and bifurcation analysis tools are built-in, via PyCont. It also contains a library of general classes useful for scientific computing, including an enhanced array class and wrappers for SciPy algorithms. Application-specific utilities are also provided for systems biology, computational neuroscience, and biomechanics. Development of complex systems models is simplified using symbolic math capabilities and compositional model-building classes. These can be "compiled" automatically into dynamically-linked C code or Python simulators.
Simpy: SimPy (= Simulation in Python) is an object-oriented, process-based discrete-event simulation language based on standard Python. It is released under the GNU Lesser GPL (LGPL). SimPy provides the modeler with components of a simulation model including processes, for active components like customers, messages, and vehicles, and resources, for passive components that form limited capacity congestion points like servers, checkout counters, and tunnels. It also provides monitor variables to aid in gathering statistics. Random variates are provided by the standard Python random module. SimPy comes with data collection capabilities, GUI and plotting packages. It can be easily interfaced to other packages, such as plotting, statistics, GUI, spreadsheets, and databases.
Pyarie: Pyarie is a continuous modeling environment useful for modeling systems of ordinary differential equations. The system is designed to be modular so that state variables and relationships, as well as complete models, can be re-used and re-defined and combined. Multiple integration methods are supplied for ODEs, and tools for optimization and linear programming are currently being built. Pyarie is being designed so little to no knowledge of programming is necessary for its use, but with full access to its structures, so that programmers can extend the system at will and use it as a powerful continuous modeling programming language.
Model-Builder. Model-Builder is a GUI-based application for building and simulating ODE (Ordinary Differential Equations) models. Models are defined in mathematical notation, with no coding required by the user. Results can be exported in csv format. Graphical output based on matplotlib includes time-series plots, state-space plots, spectrograms, and continuous wavelet transforms of time series. It also includes a sensitivity and uncertainty analysis module. Ideal for classroom use.
VFGEN: VFGEN is a source code generator for differential equations and delay differential equations. The equations are defined once in an XML format, and then VFGEN is used to generate the functions that implement the equations in a wide variety of formats. Python users will be interested in the SciPy, PyGSL, and PyDSTool commands provided by VFGEN.
GarlicSim: GarlicSim is a framework for working with simulations. It is general, and not specific to any field of study. GarlicSim takes the "world state" and the "step function" concepts as the basic elements of the simulation, and builds on that, allowing users to rapidly develop simulations in a modular, object-oriented fashion.
DAE Tools: DAE Tools is a cross-platform equation-oriented process modelling and optimization software. Various types of processes (lumped or distributed, steady-state or dynamic) can be modelled and optimized. Equations can be ordinary or discontinuous, where discontinuities are automatically handled by the framework. The simulation/optimization results can be plotted and/or exported into various formats. Currently, Sundials IDAS solver is used to solve DAE systems and calculate sensitivities, BONMIN, IPOPT, and NLOPT solvers are used to solve NLP/MINLP problems, while various direct/iterative sparse matrix linear solvers are interfaced: SuperLU and SuperLU_MT, Intel Pardiso, AMD ACML, Trilinos Amesos (KLU, Umfpack, SuperLU, Lapack) and Trilinos AztecOO (with built-in, Ifpack or ML preconditioners). Linear solvers that exploit GPGPUs are also available (SuperLU_CUDA, CUSP; still in an early development stage).
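The SimPy entry above describes processes (such as customers) competing for limited-capacity resources (such as a server). The idea behind discrete-event simulation can be sketched with the standard library alone. This is a minimal single-server FIFO queue, not SimPy's API. The function name and arguments are hypothetical.

```python
import heapq


def simulate_single_server(arrival_times, service_time):
    """Return each customer's departure time from a one-server FIFO queue.

    Illustrative sketch only: customers are the "processes", the single
    server is the limited-capacity "resource".
    """
    # The event queue holds (time, customer index), ordered by time.
    events = [(t, i) for i, t in enumerate(arrival_times)]
    heapq.heapify(events)

    server_free_at = 0.0
    departures = {}
    while events:
        now, customer = heapq.heappop(events)
        # A customer waits if the server is still busy on arrival.
        start = max(now, server_free_at)
        server_free_at = start + service_time
        departures[customer] = server_free_at
    return [departures[i] for i in range(len(arrival_times))]
```

Simulated time jumps from one event to the next rather than advancing in fixed steps, which is the main difference from the ODE integrators listed in this section.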
Economics and Econometrics
pyTrix: a small set of utilities for economics and econometrics, including pyGAUSS (GAUSS command analogues for use in scipy).
pandas: data structures and tools for cross-sectional and time series data sets
Electromagnetics and Electrical Engineering
PyFemax: computation of electro-magnetic waves in accelerator cavities.
EMPy (Electromagnetic Python): Various common algorithms for electromagnetic problems and optics, including the transfer matrix algorithm and rigorous coupled wave analysis.
Optics of multilayer films, including the transfer-matrix method, coherent and incoherent propagation, and depth-dependent absorption profiles.
openTMM: An electrodynamic S-matrix (transfer matrix) code with modern applications.
pyofss analyzes optical fibre telecommunication systems, including numerically integrating the appropriate Schrödinger-type equation to calculate fibre dispersion.
electrode, a toolkit to develop and analyze rf surface ion traps.
mwavepy: Compilation of functions for microwave/RF engineering. Useful for tasks such as calibration, data analysis, data acquisition, and plotting functions.
netana: Electronic Network Analyzer, solves electronic AC & DC Mesh and Node network equations using matrix algebra.
seawater is a package for computing properties of seawater (UNESCO 1981 and UNESCO 1983).
Fluid is a series of routines for calculating properties of fluids (air and seawater), and their interactions (e.g., wind stress).
atmqty computes atmospheric quantities on earth.
PyClimate - Analysis of climate data in Python performs EOF analysis, downscaling by means of CCA and analogs (in the PC and CCC spaces), linear digital filters, kernel based probability density function estimation and access to DCDFLIB.C library from Python, amongst many other things.
ClimPy Hydrologic orientated library
GIS Python Python programs and libraries for geodata processing
MGLTOOLS: a comprehensive set of tools for molecular interaction calculations and visualization.
Biskit: an object-oriented platform for structural bioinformatics research. Structure and trajectory objects tightly integrate with numpy allowing, for example, fast take and compress operations on molecules or trajectory frames. Biskit integrates many external programs (e.g. XPlor, Modeller, Amber, DSSP, T-Coffee, Hmmer...) into workflows and supports parallelization via a high-level access to pyPvm.
PyMOL: a molecular graphics system with an embedded Python interpreter designed for real-time visualization and rapid generation of high-quality molecular graphics images and animations.
UCSF Chimera: UCSF Chimera is a highly extensible, interactive molecular graphics program. It is the successor to UCSF Midas and MidasPlus; however, it has been completely redesigned to maximize extensibility and leverage advances in hardware. UCSF Chimera can be downloaded free of charge for academic, government, non-profit, and personal use.
The Python Macromolecular Library (mmLib): a software toolkit and library of routines for the analysis and manipulation of macromolecular structural models. It provides a range of useful software components for parsing mmCIF, PDB, and MTZ files, a library of atomic elements and monomers, an object-oriented data structure describing biological macromolecules, and an OpenGL molecular viewer.
MDTools for Python: MDTools is a Python module which provides a set of classes useful for the analysis and modification of protein structures. Current capabilities include reading psf files, reading and writing (X-PLOR style) pdb and dcd files, calculating phi-psi angles and other properties for arbitrary selections of residues, and parsing output from NAMD into an easy-to-manipulate data object.
BALL - Biochemical Algorithms Library: a set of libraries and applications for molecular modeling and visualization. OpenGL and Qt are the underlying C++ layers; some components are LGPL licensed, others GPL.
PyVib2: a program for analyzing vibrational motion and vibrational spectra. The program is supposed to be an open source "all-in-one" solution for scientists working in the field of vibrational spectroscopy (Raman and IR) and vibrational optical activity (ROA and VCD). It is based on numpy, matplotlib, VTK and Pmw.
ASE is an atomistic simulation environment written in Python with the aim of setting up, steering, and analyzing atomistic simulations. It can use a number of backend calculation engines (Abinit, Siesta, Vasp, Dacapo, GPAW, ...) to perform ab-initio calculations within Density Functional Theory. It can do total energy calculations, molecular dynamics, geometry optimization and much more. There is also a GUI and visualization tools for interactive work.
GNU Radio is a free software development toolkit that provides the signal processing runtime and processing blocks to implement software radios using readily-available, low-cost external RF hardware and commodity processors. GNU Radio applications are primarily written using the Python programming language, while the supplied, performance-critical signal processing path is implemented in C++ using processor floating point extensions where available. Thus, the developer is able to implement real-time, high-throughput radio systems in a simple-to-use, rapid-application-development environment. While not primarily a simulation tool, GNU Radio does support development of signal processing algorithms using pre-recorded or generated data, avoiding the need for actual RF hardware.
pysamplerate is a small wrapper for Secret Rabbit Code (http://www.mega-nerd.com/SRC/), still incomplete, but which can already be used for high quality resampling of audio signals, even for non-rational ratios.
audiolab is a small library to import data from audio files to numpy arrays, and export numpy arrays to audio files. It uses libsndfile for the IO (http://www.mega-nerd.com/libsndfile/), which means many formats are available, including wav, aiff, HTK format and FLAC, an open source lossless compressed format. Previously known as pyaudio (not to be confused with the unrelated pyaudio package), now part of scikits.
PyWavelets is a user-friendly Python package to compute various kinds of Discrete Wavelet Transform.
PyAudiere is a very flexible and easy to use audio library for Python users. Available methods allow you to read soundfiles of various formats into memory and play them, or stream them if they are large. You can pass sound buffers as NumPy arrays of float32's to play (non-blocking). You can also create pure tones, square waves, or 'on-line' white or pink noise. All of these functions can be utilized concurrently.
CMU Sphinx is a free automatic speech recognition system. The SphinxTrain package for training acoustic models includes Python modules for reading and writing Sphinx-format acoustic feature and HMM parameter files to/from NumPy arrays.
Symbolic math, number theory, etc.
Swiginac: SWIG wrappers around GINAC, a C++ symbolic math library.
NZMATH: NZMATH is a Python based number theory oriented calculation system developed at Tokyo Metropolitan University. It contains routines for factorization, gcd, lattice reduction, factorial, finite fields, and other such goodies. Unfortunately short on documentation, but contains a lot of useful stuff if you can find it.
SAGE: a comprehensive environment with support for research in algebra, geometry and number theory. It wraps existing libraries and provides new ones for elliptic curves, modular forms, linear and non commutative algebra, and a lot more.
SymPy: SymPy is a Python library for symbolic mathematics. It aims to become a full-featured computer algebra system (CAS) while keeping the code as simple as possible in order to be comprehensible and easily extensible. SymPy is written entirely in Python and does not require any external libraries, except optionally for plotting support.
Python bindings for CLNUM: a library which provides exact rationals and arbitrary precision floating point, orders of magnitude faster (and more full-featured) than the Decimal.py module from Python's standard library. From the same site, the ratfun module provides rational function approximations, and rpncalc is a full RPN interactive Python-based calculator.
DecInt: a Python class that provides support for operations on very large decimal integers. Conversion to and from the decimal string representation is very fast; the multiplication and division algorithms are asymptotically faster than the native Python ones.
Kayali is a Qt based Computer Algebra System (CAS) written in Python. It is essentially a front end GUI for Maxima and Gnuplot.
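For readers unfamiliar with the terms in the CLNUM entry above, the snippet below shows exact rational arithmetic and arbitrary-precision decimal floating point using only Python's standard library (`fractions` and `decimal`). It illustrates the kind of arithmetic these packages offer and does not use CLNUM or any other listed package. The 50-digit precision is an arbitrary choice.

```python
from decimal import Decimal, getcontext
from fractions import Fraction

# Exact rational arithmetic: no rounding error accumulates.
third = Fraction(1, 3)
total = third + third + third  # exactly 1, unlike 0.1 + 0.2 in binary floats

# Arbitrary-precision decimal floating point: precision is set explicitly.
getcontext().prec = 50  # 50 significant digits (arbitrary for this sketch)
root_two = Decimal(2).sqrt()

print(total)
print(root_two)
```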
- These are just other links which may be very useful to scientists, but which I don't quite know where to categorize, or for which I didn't want to make a single-link category.
PyMat: PyMat exposes the MATLAB engine interface allowing Python programs to start, close, and communicate with a MATLAB engine session. In addition, the package allows transferring matrices to and from an MATLAB workspace. These matrices can be specified as NumPy arrays, allowing a blend between the mathematical capabilities of NumPy and those of MATLAB.
mlabwrap: A high-level Python-to-MATLAB bridge. Instead of opening connections to the MATLAB engine and executing statements, MATLAB functions are exposed as Python functions and complicated structures as proxy objects.
pythoncall: A MATLAB-to-Python bridge. Runs a Python interpreter inside MATLAB, and allows transferring data (matrices etc.) between the Python and Matlab workspaces.
IDL to Numeric/numarray Mapping: a summary mapping between IDL and numarray. Most of the mapping also applies to Numeric.
Pybliographer: a tool for managing bibliographic databases. It can be used for searching, editing, reformatting, etc. In fact, it's a simple framework that provides easy to use python classes and functions, and therefore can be extended to many uses (generating HTML pages according to bibliographic searches, etc). In addition to the scripting environment, a graphical Gnome interface is available. It provides powerful editing capabilities, a nice hierarchical search mechanism, direct insertion of references into LyX and Kile, direct queries on Medline, and more. It currently supports the following file formats: BibTeX, ISI, Medline, Ovid, Refer.
pyreport: runs a script and captures the output (pylab graphics included). Generates a LaTeX or pdf report out of it, including literal comments and pretty printed code.
Vision Egg: a powerful, flexible, and free way to produce stimuli for vision research experiments.
PsychoPy: a freeware library for creating vision research experiments (and analysing the data), with an emphasis on psychophysics.
PyEPL: the Python Experiment Programming Library. A free library to create experiments ranging from simple display of stimuli and recording of responses (including audio) to the creation of interactive virtual reality environments.
Pythonica: a Python implementation of a symbolic math program, based upon the fantastic precedent set by Mathematica.
Module dependency graph: a few scripts to glue modulefinder.py into graphviz, producing import dependency pictures pretty enough for use as a poster, and containing enough information to be a core part of my process for understanding physical dependencies.
Modular toolkit for Data Processing (MDP): a library to implement data processing elements (nodes) and to combine them into data processing sequences (flows). Already implemented nodes include Principal Component Analysis (PCA), Independent Component Analysis (ICA), Slow Feature Analysis (SFA), and Growing Neural Gas.
FiPy: FiPy is an object oriented, partial differential equation (PDE) solver, written in Python, based on a standard finite volume (FV) approach. The framework has been developed in the Metallurgy Division and Center for Theoretical and Computational Materials Science (CTCMS), in the Material Measurement Laboratory (MML) at the National Institute of Standards and Technology (NIST).
SfePy: SfePy is a finite element solver written in Python, with the time demanding parts implemented in C and interfaced by SWIG. It can be used to solve various problems described by partial differential equations in 2D or 3D, for example the linear elasticity, hyperelasticity, heat conduction, Navier-Stokes, Biot, and other problems. As a research code it is used to implement models derived by the theory of homogenization, with applications in modeling of porous media (for example bones or soft tissue organs) or phononic materials.
Hermes: Hermes is a free C++/Python library for rapid prototyping of adaptive FEM and hp-FEM solvers developed by an open source community around the hp-FEM group at the University of Nevada, Reno.
FEval: FEval is useful for conversion between many finite element file formats. The main functionality is extraction of model data in the physical domain, for example to calculate flow lines.
CSC Climate Wiki: wiki for the Climate Systems Center (CSC) at the University of Chicago. Topics include climate research, the philosophy of modularizing climate models, the use of Python in climate modeling, and software packages produced by CSC. This site contains a lot of useful information about Python for scientific computing.
peak-o-mat: peak-o-mat is a curve fitting program for the spectroscopist. It is especially designed for batch cleaning, conversion and fitting of spectra from visible optics experiments when you are facing a large number of similar spectra.
SciPyAmazonAmi: Add software you would like installed on a publicly available Amazon EC2 image here
PyCVF: A computer vision and videomining Framework.
CNEMLIB: provides an implementation of CNEM in 2D and 3D. CNEM is a generalisation of the Natural Element Method to non-convex domains; it is a FEM-like approach. The main functionalities of CNEMLIB are: i) interpolation of scattered data spread over convex or non-convex domains, with the Natural Neighbour interpolant (Sibson) in 2D, and the Natural Neighbour interpolant (Sibson or Laplace) or the linear finite element interpolant over the Delaunay tessellation in 3D; ii) a gradient matrix operator for calculating nodal gradients of scattered data (the approach is based on stabilized nodal integration, SCNI); iii) general assembly tools to construct the assembled matrix associated with a weak formulation (heat problem, mechanical problem, hydrodynamic problem, general-purpose problem), as used with the Finite Element Method (FEM).