How many tools have you used for interactive data visualization based on Python?

01 Data visualization in Python

Most data visualization research today is done with D3. Unfortunately, I only have 8 weeks to spend with students, so I can only focus on teaching a combination of theory and practice to help them become data scientists.

Although students are happy to use visualization techniques to explore and explain problems, most of them are less interested in using D3 to create beautiful custom visualizations. According to the feedback from the professor who taught this course before, it is impossible to teach D3 in such a short time.

Given my own love of Python and how comfortable students are with it, I decided to introduce them to the magical (I hope so!) Python packages that can implement everything I show in class.

02 Seaborn's static visualization

Given my past experience using seaborn, I am very happy to be able to introduce students to the beautiful visual patterns produced by seaborn. They already have experience using matplotlib, so it is easy to learn seaborn and the advantages are huge.

Students can make scatter plots (bivariate and multivariate), swarmplots, violin plots, bar graphs, box plots and histograms with facets. They learned that using large data sets to generate swarmplots is very time-consuming, and summary-based plots (such as violin plots) are a better choice.
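
The trade-off between swarmplots and violin plots can be sketched as follows. This is a minimal illustration rather than course material: the data frame is synthetic, and the column names `group` and `value` are assumptions made for the example.

```python
import matplotlib
matplotlib.use("Agg")  # render off-screen; drop this line inside a notebook
import matplotlib.pyplot as plt
import numpy as np
import pandas as pd
import seaborn as sns

# Synthetic data (hypothetical): three groups of 200 points each.
rng = np.random.default_rng(0)
df = pd.DataFrame({
    "group": np.repeat(["a", "b", "c"], 200),
    "value": np.concatenate(
        [rng.normal(loc, 1.0, 200) for loc in (0.0, 1.0, 2.5)]
    ),
})

fig, (ax_swarm, ax_violin) = plt.subplots(1, 2, figsize=(10, 4), sharey=True)

# A swarmplot places every point individually, so its cost grows with the data.
sns.swarmplot(data=df, x="group", y="value", size=2, ax=ax_swarm)
ax_swarm.set_title("swarmplot (one marker per row)")

# A violin plot draws one density summary per group, so it scales much better.
sns.violinplot(data=df, x="group", y="value", ax=ax_violin)
ax_violin.set_title("violin plot (summary per group)")

fig.tight_layout()
# Call plt.show() or fig.savefig(...) to view the result.
```

Increasing the number of rows makes the swarmplot noticeably slower, while the violin plot stays roughly constant in cost.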

03 Realize interactive visualization with Bokeh or Plot.ly

Although seaborn can produce beautiful visualizations, they are all static. I want students to experience the benefits of interactive techniques such as brushing, filtering, zooming and hovering. To this end, I introduced the visualization libraries Bokeh and Plot.ly, which make it easy to build interactive data visualizations.

For visualizing time series, students can choose Bokeh or Plot.ly to implement multi-line charts, heatmaps, animated bubble charts, etc.
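
As a rough idea of what such a chart looks like in Bokeh, here is a minimal sketch of a multi-line time series chart with zooming and hovering. The data is randomly generated, and the column names `date`, `series_a` and `series_b` are made up for the example; it assumes a reasonably recent Bokeh release.

```python
import numpy as np
import pandas as pd
from bokeh.models import ColumnDataSource, HoverTool
from bokeh.plotting import figure

# Synthetic random-walk data (hypothetical), one row per day.
rng = np.random.default_rng(1)
df = pd.DataFrame({"date": pd.date_range("2018-01-01", periods=90, freq="D")})
df["series_a"] = rng.normal(0, 1, len(df)).cumsum()
df["series_b"] = rng.normal(0, 1, len(df)).cumsum()
source = ColumnDataSource(df)

# Pan, zoom and reset come from the toolbar; no JavaScript is written by hand.
p = figure(x_axis_type="datetime", title="Two time series",
           tools="pan,wheel_zoom,box_zoom,reset")
for name, color in [("series_a", "navy"), ("series_b", "firebrick")]:
    p.line("date", name, source=source, legend_label=name,
           color=color, line_width=2)

# Hovering shows the date and both values at the cursor position.
p.add_tools(HoverTool(
    tooltips=[("date", "@date{%F}"),
              ("series_a", "@series_a{0.00}"),
              ("series_b", "@series_b{0.00}")],
    formatters={"@date": "datetime"},
    mode="vline",
))
p.legend.click_policy = "hide"  # clicking a legend entry toggles that line

# To view: from bokeh.io import show; show(p)
```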

04 Visualizing trees, graphs and networks

When discussing techniques for visualizing hierarchical data, I was happy to show the treemap technique and compare it with node-link diagrams. Unfortunately, when I dug deeper, I did not find a way to implement a multi-level treemap. Even with the squarify library, you can only generate a single-level treemap in Python!

The wonderful networkx software package can be used to analyze graphs and networks. However, network visualization can only be achieved with matplotlib or igraph or plotly (see the tutorial on using plotly to achieve network visualization). igraph has many different options to help users try to configure the graph, but it is very inconvenient to set up, so many students have encountered problems when using it. On the other hand, plot.ly works smoothly, but there are few options for customizing the network diagram.
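
A minimal sketch of the networkx plus matplotlib route is shown below. It uses the karate club graph that ships with networkx; colouring nodes by degree centrality is just one illustrative choice, not something prescribed by the course.

```python
import matplotlib
matplotlib.use("Agg")  # render off-screen; drop this line inside a notebook
import matplotlib.pyplot as plt
import networkx as nx

# networkx handles the analysis ...
G = nx.karate_club_graph()
centrality = nx.degree_centrality(G)

# ... and matplotlib does the (static) drawing.
pos = nx.spring_layout(G, seed=42)  # fixed seed for a reproducible layout
fig, ax = plt.subplots(figsize=(6, 6))
nx.draw_networkx(
    G,
    pos=pos,
    ax=ax,
    node_color=[centrality[n] for n in G.nodes()],
    cmap=plt.cm.viridis,
    node_size=300,
    font_size=8,
    edge_color="lightgray",
)
ax.set_axis_off()
```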

05 Geospatial visualization

Given that creating interactive maps is an important part of data visualization, I was fairly confident that I would find packages for creating choropleth maps, symbol maps, cartograms, transit maps and even flow maps. The following are the geospatial visualization libraries I found in Python:

Plot.ly allows you to create choropleth maps and symbol maps, but it gives you very little control over how the maps are built.
Geoplotlib is a small and easy-to-use package. It is built on pyglet, but it is a bit unstable and often crashes. It uses OpenStreetMap tiles and even supports animated visualization of spatial data. I like this package very much because it comes with some concise and easy-to-use examples.
Geoplot looks perfect, and it comes with some great examples, but neither I nor my students could install it. Given that most of us do not use conda, we should heed this warning: "Please use caution because this may not work on Windows and may not run on OSX and Linux."
Cartopy and geopandas+matplotlib only generate static visualizations, so I have not tried them.

06 Text visualization

We learned about a range of text visualization techniques, such as tag clouds (for example, Wordle), DocuBursts, parallel tag clouds, phrase nets and word trees, and I also introduced techniques for topic exploration and sentiment visualization.

Unfortunately, apart from the word_cloud package, there are few options in Python for people who want to visualize a single document or a large text collection.

07 Interactive data visualization on the Web

Currently, Bokeh and Plot.ly Dash are the main options for creating interactive dashboards that allow multi-view selection and filtering. Bokeh has very few examples, and Plot.ly Dash is very important for users who are used to creating visualizations in Python.

Plot.ly Dash is built on Flask, Plotly.js and React.js, which also raises the barrier to creating coordinated multi-view visualizations. Some student teams in my class used Plot.ly Dash for their final projects, and they learned it very quickly. The following link is a simple example by Ryan Campa and Shikhar Gupta, who used Dash to visualize the TED talk data set.

http://campa-gupta.herokuapp.com/

08 Will Altair be the ideal choice?

As the course progressed, news arrived about Altair, which brings Python and Vega together! I was pleased to learn this, since Vega, which I already use, comes from the UW Interactive Data Lab. Jim Vallandingham's excellent "Introduction to Altair" tutorial is a good starting point.

Jake VanderPlas, the main developer of Altair, recently posted links to his Python notebooks and PyCon 2018 videos. I have been playing with it since then and I like it very much! Data scientists want to explore their data and create visualizations to explain it to internal and external audiences. I hope Altair can meet those needs.

09 Summary

Data scientists like to use visualization libraries and packages in Python, and I hope that tools like Altair are the ultimate approach. Software packages such as plotly, seaborn, bokeh, geoplotlib will continue to develop and have more features. Interactive data visualization (for the Web) through Python will have a brighter future, and we look forward to this day!

Origin blog.csdn.net/weixin_44999079/article/details/95023378