Thursday Complexity Post

October 17, 2013

Messy Big Data Needs New Math

Huge quantities of complex, messy, multidimensional data gathered from biological and human social systems are challenging to analyze, because such collections lack the formal structure they might have had if the data had been accumulated to examine a specific question. And such data sets are burgeoning in multiple fields, from medical records and genomic sequencing to neural networks in the brain and social networks in human life.

A story by Jennifer Ouellette in Quanta Magazine explains that today's big data is "noisy, unstructured and dynamic," sometimes corrupted and sometimes incomplete, and that a wide range of mathematical tools and techniques are needed to make sense of it. Yale mathematician Ronald Coifman asserts that we need a "big data equivalent of a Newtonian revolution, on a par with the seventeenth century invention of calculus." He believes new techniques developing in modern math will help identify and make visible the underlying structures of big data sets.

In an article in the Santa Fe New Mexican, Simon DeDeo, a research fellow in applied mathematics and complex systems at the Santa Fe Institute, suggests that the computer revolution is aiding the discovery of some universal principles hidden in massive data. For example, he says, the mathematical models that describe the conflict and cooperation in editing contentious Wikipedia entries are remarkably similar to models based on the outbreak and resolution of wars among ancient Greek city-states. He and colleagues are now looking at the U.S. government shutdown to determine whether that conflict can be modeled using the same math.

The Quanta story tells how DeDeo analyzed 300 years' worth of data from the archives of the Old Bailey, the criminal court of England and Wales. He used spreadsheets to record information from nearly 200,000 trials, which included charge, verdict and sentence, and transcripts containing 10 million words. Using text recognition, he sifted through the words, grouping them into 1,000 categories. "Now you've turned the trial into a 1,000 dimensional space that tells you how much the trial is about friendship, or trust, or clothing," he told Quanta.
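To make the idea of a trial as a point in a many-dimensional space concrete, here is a minimal Python sketch. It is not DeDeo's actual method or data: the three categories, the tiny word list, and the function name trial_vector are all hypothetical stand-ins for the roughly 1,000 categories described above.

```python
import re
from collections import Counter

# Hypothetical stand-ins: the study described above used about 1,000
# categories; three are enough to show the idea.
CATEGORIES = ["friendship", "trust", "clothing"]

# Hypothetical word-to-category lexicon (an assumption for illustration).
LEXICON = {
    "friend": "friendship",
    "companion": "friendship",
    "promise": "trust",
    "honest": "trust",
    "coat": "clothing",
    "handkerchief": "clothing",
}


def trial_vector(transcript):
    """Return the share of categorized words that fall in each category."""
    words = re.findall(r"[a-z]+", transcript.lower())
    counts = Counter(LEXICON[w] for w in words if w in LEXICON)
    total = sum(counts.values())
    if total == 0:
        # No recognized words: the trial sits at the origin.
        return [0.0] * len(CATEGORIES)
    return [counts[c] / total for c in CATEGORIES]


vec = trial_vector("The prisoner took a coat and a handkerchief from his friend.")
for name, share in zip(CATEGORIES, vec):
    print(f"{name}: {share:.2f}")
```

Each trial becomes one list of numbers, one coordinate per category, so trials can then be compared by how close their vectors are to each other.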

In his New Mexican article, DeDeo writes that he and collaborators saw ideals of modern justice and fairness evolving from a harsh medieval world. In the 1600s, he writes, "incorrigible pickpockets" were sentenced to die; in the 1700s, people convicted of violent and nonviolent crimes met similar fates and were described in similar language. Over the next 150 years, the data show a growing recognition that murder and rape differ from petty theft and fraud and should be treated differently, a dramatically important social shift.

Gunnar Carlsson, a mathematician at Stanford University, studies cumbersome complex data using topological data analysis (TDA). Carlsson says TDA is a way of getting structured data out of unstructured data, so that machine learning, a set of techniques to construct and study systems that can learn from data, will work on it. Watch Carlsson's short YouTube lecture.

The seeds of TDA and modern network theory go back to the Seven Bridges of Königsberg, a math problem popular in the eighteenth century, Ouellette writes. The challenge asks whether a person can visit each of four separate land areas while crossing each of the seven connecting bridges exactly once. The mathematician Leonhard Euler realized that distances and positions didn't matter. What mattered was the number of land masses (the nodes) and how the bridges connected them (the links, or edges).

Map of Königsberg in Euler's time showing the actual layout of the seven bridges, highlighting the river Pregel. (Image: Wikipedia)
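Euler's insight can be checked in a few lines of Python. The sketch below assumes the standard textbook encoding of the puzzle, with the four land areas labeled A through D (the labels and function names are illustrative, not from the Quanta story), and uses the well-known rule that a connected network can be walked with every link crossed exactly once only if zero or two nodes have an odd number of links.

```python
from collections import Counter

# Assumed labels: A is the central island, B and C are the two river banks,
# D is the eastern land area. Each pair is one bridge.
BRIDGES = [
    ("A", "B"), ("A", "B"),
    ("A", "C"), ("A", "C"),
    ("A", "D"), ("B", "D"), ("C", "D"),
]


def degrees(edges):
    """Count how many bridge ends touch each land area."""
    deg = Counter()
    for u, v in edges:
        deg[u] += 1
        deg[v] += 1
    return deg


def is_connected(edges):
    """Check that every land area can be reached from every other."""
    adj = {}
    for u, v in edges:
        adj.setdefault(u, set()).add(v)
        adj.setdefault(v, set()).add(u)
    if not adj:
        return True
    start = next(iter(adj))
    seen = {start}
    stack = [start]
    while stack:
        node = stack.pop()
        for neighbor in adj[node]:
            if neighbor not in seen:
                seen.add(neighbor)
                stack.append(neighbor)
    return len(seen) == len(adj)


def has_euler_walk(edges):
    """True if some walk crosses every edge exactly once."""
    odd = sum(1 for d in degrees(edges).values() if d % 2 == 1)
    return is_connected(edges) and odd in (0, 2)


print(dict(degrees(BRIDGES)))   # number of bridges at each land area
print(has_euler_walk(BRIDGES))  # False: all four areas have odd degree
```

Nothing in the code refers to distances or positions on the map, only to which land areas each bridge joins, which is exactly the simplification Euler made.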

Carlsson says huge, raw data sets with many dimensions can be mathematically compressed into lower-dimension structures that show primary regions and how they are connected.

Carlsson developed technology, offered through his company Ayasdi, that can produce maps visualizing compressed representations of huge data sets. For instance, the Quanta story says, data from a breast cancer study was initially recorded on spreadsheets with 1,500 columns and 272 rows representing different genomic samples from patients. When the data was transformed by TDA into a network, the map took the shape of a Y. Patients who died were clustered on the left branch, and a smaller number who survived were on the right branch, allowing geneticists to study factors that influence survival.

Read the Quanta story here and DeDeo's article here. Read Ouellette's Quanta piece on quantum computers, machine learning and big data, in Wired Magazine, here.

Remember PlexusCalls!

PlexusCalls

Friday, October 25, 1-2 PM ET

Buildings Designed for Beauty, Life and Work

Guests: Robert Peck, Thomas Lockwood and Sharon Benjamin

Are humans hardwired to seek beauty in the places where they live and work? Does the design and aesthetic appeal of the workplace influence performance? Three leaders experienced in businesses and organizations who also understand design share their views.

Robert Peck is Southeast Region Director of Consulting for Gensler, a global architecture and design firm. His group helps clients with workplace strategy, occupancy management and sustainable design. He is a nationally recognized advocate for high-quality public architecture, smart growth and sustainable design. For five years in the Clinton Administration and nearly three years in the Obama Administration, Bob was Commissioner of the U.S. General Services Administration's Public Buildings Service, eventually managing a budget of more than $9 billion and a workforce of 7,000.

Thomas Lockwood is an international expert in design leadership and in integrating design and innovation into business. He holds a rare PhD in design management and is a passionate advocate for the value of design to the triple bottom line: improving economic, social and environmental well-being. Lockwood is the co-editor of four books: The Handbook of Design Management (2011), Design Thinking (2010), Corporate Creativity (2008) and Building Design Strategy (2007).

Sharon Benjamin, PhD, is principal of Alchemy, a Washington, D.C.-based management consulting practice. She consults with multilateral, NGO and healthcare organizations. An adjunct at NYU, she teaches the leadership capstone course for MPA students. Her work supports leaders seeking to effect profound transformation within themselves and their organizations, pioneering innovative methods such as Positive Deviance.

Read their complete bios.

Nursing Network PlexusCalls

Wednesday, November 6, 1-2 PM ET

Complexity in Education, Leadership, Research, & Innovation

Guests: Gail Mitchell and Nadine Cross

This session will include a conversation between the two speakers about their involvement in five different projects that integrate complexity thinking with: education (pedagogy and eLearning), patient safety (Seeing the Forest), health coaching (developing and supporting 5 RNHCs in communities with persons living with diabetes and dementia), leadership and metaphor, and research-based drama on relationality and dementia. Each area will open with a critical question and at least one understanding about complexity thinking and its applicability that has emerged from the work. The guest speakers invite participants and listeners to join the conversation by sharing which ideas or projects resonate with their own work, and how they might build on the ideas or take them in a different direction.

Read their complete bios.

Audio from all PlexusCall series is available by searching the iTunes store for plexuscalls. Or visit plexusinstitute.org under Resources/Call Series.

Plexus Institute

1025 Connecticut Ave, NW Ste 1000

Washington, DC 20036

Phone: 888-466-4884

[email protected]

www.plexusinstitute.org

...fostering the health of individuals, families, communities, organizations, and our natural environment by helping people use concepts emerging from the new science of complexity

Join Plexus