
NIMBioS




Working Group: Biological Problems using Binary Matrices

 
Working group photo (L to R): Daniel B. Stouffer, Robert Dorazio, Richard Barker, Diego Vázquez, Steven Schwager, Stefano Allesina, Nicholas Gotelli, Joshua Ladau, Steven Kembel. Not pictured: Jennifer Dunne, Dan Simberloff, Edward Connor.
 

Tackling a Math Problem for Ecology

Binary photo.
Many ecological data sets can be summarized as binary matrices, for instance the distribution of aquatic organisms in lakes in the Sierra Nevada Mountains (above). A major goal of the Binary Matrices NIMBioS Working Group is to develop improved methods for inferring large-scale effects of ecological mechanisms from this type of data.

For years, ecologists have struggled to answer some fundamental questions in ecology using certain statistical tests. Measuring the distribution of a species over time or geographical space, for example, poses a challenge, as does studying food webs and pollinator networks or investigating how ecological communities are composed.

Because at large scales such questions often cannot be answered by conducting an experiment, they are explored by making inferences from observational data, specifically binary matrices.

Ecologists turn to a statistical method called null model testing to examine binary matrices. Yet, the null model testing of binary matrices has proved problematic.

In May 2009, a team of biologists, statisticians, and mathematicians met at the National Institute for Mathematical and Biological Synthesis (NIMBioS) in Knoxville, Tenn., to tackle the statistical issue and work toward a mathematical solution.

"The idea of using a mathematical framework for developing optimal methods is new," said Joshua Ladau, co-organizer of the NIMBioS Working Group and postdoctoral fellow at the Gladstone Institutes at the University of California, San Francisco.

Ladau believes one of the chief problems of using null model tests is that they’ve been developed on the basis of intuition, which can lead to conflicting and unreliable results.

The goal of the NIMBioS working group is to develop an overarching mathematical framework that will guide the development and application of null model tests.

"We hope doing this work will help find answers to fundamental ecological questions," Ladau said. "One of the roadblocks to finding those answers is the shortcomings of the current methodologies."

Bringing researchers together from different disciplines will hopefully lead to new insights and in turn generate new ecological questions, Ladau said.

The National Institute for Mathematical and Biological Synthesis (NIMBioS) brings together researchers from around the world to collaborate across disciplinary boundaries to investigate solutions to basic and applied problems in the life sciences. NIMBioS is funded by the National Science Foundation in collaboration with the U.S. Department of Homeland Security and the U.S. Department of Agriculture, with additional support from The University of Tennessee, Knoxville.

For more information, contact Catherine Crawley at 865-974-9350 or ccrawley@nimbios.org



NIMBioS Working Group: Biological Problems using Binary Matrices

Topic: Working Group on Biological Problems using Binary Matrices

Meeting dates: May 26-29 and December 10-13, 2009; May 4-7, 2010

Organizers: Edward F. Connor (Department of Biology, San Francisco State University, San Francisco, CA); Joshua Ladau (Gladstone Institutes, GICD, San Francisco, CA)

Objectives: Many fundamental questions in ecology cannot be addressed experimentally because at the relevant large spatial and temporal scales, experimentation is impractical, unethical, or impossible. Instead, to investigate these questions, inferences must be made from observational data. Null model testing is a key tool for making these inferences, allowing large-scale effects of processes such as environmental filtering, competition, and facilitation to be inferred from observations of species ranges, abundance distributions, body sizes, and other similar traits. Three types of ecological data that are commonly analyzed using null models are binary presence-absence matrices, which give the distribution of species over a set of sites; ecological networks such as food webs and pollinator networks; and phylogenetic patterns in community composition. All of these data can be coded in a binary form.

The Binary Matrices working group will focus on null model tests of binary data, with a particular emphasis on the aforementioned examples. A key problem with null model tests is that they are generally developed and justified based on intuition. However, multiple tests can all seem intuitively appropriate for the same data, yet yield conflicting conclusions. Hence, a pressing issue is developing and implementing an overarching mathematical framework to guide the development and application of null model tests. One such framework is optimality: for instance, considering methods that have minimal Type II error rates subject to controlled Type I error rates. Further application of the optimality framework is possible, and the development and application of other types of guiding frameworks are worth considering.
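As a hedged illustration of the optimality idea, the criterion can be written in standard hypothesis-testing notation. The symbols used here (X, \phi, \Theta_0, \Theta_1, \alpha) are assumptions introduced for this sketch, not notation taken from the working group.

```latex
% X: observed binary matrix; \phi(X) \in \{0,1\}: test decision (1 = reject the null)
% \Theta_0: set of null models; \Theta_1: set of alternatives; \alpha: significance level
\begin{align*}
\text{Type I error control:} \quad
  & \sup_{\theta \in \Theta_0} \Pr_\theta\bigl(\phi(X) = 1\bigr) \le \alpha, \\
\text{Type II error to minimize:} \quad
  & \Pr_\theta\bigl(\phi(X) = 0\bigr), \qquad \theta \in \Theta_1 .
\end{align*}
```

A test that attains the smallest Type II error for every alternative in \Theta_1, among all tests satisfying the first constraint, is called uniformly most powerful. Such a test need not exist, so weaker notions of optimality may be needed in practice.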

To implement null model tests developed in an optimality framework, it is often necessary to simulate random quantities from non-standard probability distributions. For instance, simulating presence-absence matrices from the uniform distribution over the set of binary matrices with the observed marginal totals is necessary for implementing methods that control Type I error rates under a relatively broad class of statistical models. The algorithmic challenges associated with these simulations can be formidable; they require the development of unbiased Markov chain Monte Carlo algorithms and analysis of their mixing times and convergence properties.
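One commonly used way to sample binary matrices with fixed row and column totals is a "trial swap" Markov chain, sketched below in Python. This is an illustrative example only, not necessarily the method the working group developed; the function name trial_swap, its parameters, and the example matrix are all hypothetical. The sketch makes no claim about how many steps are needed for adequate mixing, which is exactly the kind of question raised above.

```python
import random


def trial_swap(matrix, n_steps, seed=None):
    """Randomize a 0/1 matrix while preserving every row and column total.

    Each step picks two distinct rows and two distinct columns at random.
    If the 2x2 submatrix is a "checkerboard" ([[1,0],[0,1]] or [[0,1],[1,0]]),
    it is flipped to the other checkerboard; otherwise the matrix is left
    unchanged and the step still counts. Counting failed attempts keeps the
    chain symmetric, so its stationary distribution is uniform over the
    matrices it can reach.
    """
    rng = random.Random(seed)
    m = [list(row) for row in matrix]  # work on a copy
    n_rows, n_cols = len(m), len(m[0])
    if n_rows < 2 or n_cols < 2:
        return m  # no 2x2 submatrix exists, so nothing can be swapped

    for _ in range(n_steps):
        r1, r2 = rng.sample(range(n_rows), 2)
        c1, c2 = rng.sample(range(n_cols), 2)
        top_left, top_right = m[r1][c1], m[r1][c2]
        bot_left, bot_right = m[r2][c1], m[r2][c2]
        is_checkerboard = (
            top_left == bot_right
            and top_right == bot_left
            and top_left != top_right
        )
        if is_checkerboard:
            # Flipping all four cells leaves each row and column sum unchanged.
            m[r1][c1], m[r1][c2] = top_right, top_left
            m[r2][c1], m[r2][c2] = bot_right, bot_left
    return m


def margins(matrix):
    """Return (row totals, column totals) of a 0/1 matrix."""
    return [sum(row) for row in matrix], [sum(col) for col in zip(*matrix)]


# Hypothetical presence-absence matrix (made-up data for illustration).
observed = [
    [1, 0, 1, 0],
    [0, 1, 0, 1],
    [1, 1, 0, 0],
]
shuffled = trial_swap(observed, n_steps=2000, seed=1)
```

In a null model test, many such randomized matrices would be drawn, a test statistic computed on each, and the observed statistic compared against that reference distribution.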

The development of null model tests for species co-occurrence, ecological networks, and community phylogenetics has relied throughout on the assumption that the presence-absence matrix, food web, or community phylogeny is fully known. In the context of species co-occurrence, this has usually been articulated as assuming that the probability of detecting a species, if it is present, is equal to 1. However, with limited sampling, differences in species’ abundance, and differences in species’ behaviors, some species are easier to detect than others, rendering such an assumption suspect at best. A separate literature, arising from statistical inference based on mark-recapture-release data, has begun to examine community patterns while relaxing the assumption that detection probability = 1. Further application of these methods to problems of multi-species co-occurrence patterns (number of species > 2) or to other community ecological patterns is ripe for development. Focusing generally on these three issues, this working group will aim to foster the investigation and development of solutions to these problems.
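To make the detection issue concrete, the following minimal Python sketch simulates an observed presence-absence matrix from a "true" one when each species is detected with probability less than 1. The layout (rows are species, columns are sites), the function name observe, and all numbers are assumptions chosen for illustration; this is not a model from the working group.

```python
import random


def observe(true_matrix, detect_prob, seed=None):
    """Simulate imperfect detection on a 0/1 species-by-site matrix.

    true_matrix[i][j] is 1 if species i truly occurs at site j.
    detect_prob[i] is the probability of recording species i at a site
    where it is present. Absent species are never recorded, so the
    observed matrix can only lose presences, never gain them.
    """
    rng = random.Random(seed)
    return [
        [1 if present and rng.random() < p else 0 for present in row]
        for row, p in zip(true_matrix, detect_prob)
    ]


# Hypothetical data: species 0 is conspicuous, species 2 is hard to detect.
truth = [
    [1, 1, 0, 1],
    [0, 1, 1, 1],
    [1, 0, 1, 1],
]
observed = observe(truth, detect_prob=[0.95, 0.6, 0.2], seed=7)
```

Because presences can be missed but never invented, any co-occurrence statistic computed on the observed matrix mixes ecological signal with detection failures, which is why treating the observed matrix as fully known can be misleading.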

The goal of the Binary Matrices Working Group is to bring together biologists, statisticians, and mathematicians to address these and other related issues and so improve quantitative inference from binary data in biology.


NIMBioS Working Group on Binary Matrices:
Summary of Meeting 1, May 26-29, 2009

Participants: Stefano Allesina (National Center for Ecological Analysis and Synthesis); Richard Barker (Univ. of Otago); Robert Dorazio (USGS); Nicholas J. Gotelli (Univ. of Vermont); Steven W. Kembel (Univ. of Oregon); Joshua Ladau (Univ. of California); Steven J. Schwager (Cornell Univ.); Daniel Simberloff (Univ. of Tennessee); Daniel B. Stouffer (Estación Biológica de Doñana); Diego Vázquez (Instituto Argentino de Investigaciones de las Zonas Aridas)

The May 2009 meeting of the working group began with presentations that provided an overview of the existing analyses of binary matrices in biology. The presentations fostered extensive discussion on the challenges facing the analysis of binary matrices and approaches for addressing these challenges. Following the presentations, the working group broke into subgroups to develop specific research projects on the analysis of binary matrices. The subgroups identified four areas in which improved analyses are strongly needed:
  - the analysis of food webs,
  - pollination networks,
  - incidence-based co-occurrence patterns, and
  - abundance-based co-occurrence patterns.

The working group also developed a general strategy for developing improved analyses by combining ideas from ecology and mathematical statistics. The working group aims to complete four papers by the end of 2009 and will meet again in December 2009.


NIMBioS Working Group on Binary Matrices:
Summary of Meeting 2, Dec 10-13, 2009

Participants: Stefano Allesina (National Center for Ecological Analysis and Synthesis); Richard Barker (Univ. of Otago); Edward F. Connor (San Francisco State Univ.); Robert Dorazio (USGS); William Godsoe (NIMBioS); Nicholas J. Gotelli (Univ. of Vermont); Joshua Ladau (Univ. of California); Steven J. Schwager (Cornell Univ.); Daniel Simberloff (Univ. of Tennessee); Daniel B. Stouffer (Estación Biológica de Doñana); Diego Vázquez (Instituto Argentino de Investigaciones de las Zonas Aridas).

The meeting began with presentations from the four subgroups (the analysis of food webs; pollination networks; incidence-based co-occurrence patterns; and abundance-based co-occurrence patterns) and discussions about the progress that had been made since the last meeting. Following the presentations, the subgroups worked to further their projects. Substantial progress was made acquiring data sets for analysis, coding statistical methods, and discussing data-related matters and models. The next meeting for the group is scheduled for May 2010.


NIMBioS Working Group on Binary Matrices:
Summary of Meeting 3, May 4-7, 2010

Participants: Stefano Allesina (National Center for Ecological Analysis and Synthesis); Richard Barker (Univ. of Otago); Edward F. Connor (San Francisco State Univ.); Robert Dorazio (USGS); William Godsoe (NIMBioS); Joshua Ladau (Univ. of California); Steven J. Schwager (Cornell Univ.); Daniel Simberloff (Univ. of Tennessee); Daniel B. Stouffer (Estación Biológica de Doñana); Diego Vázquez (Instituto Argentino de Investigaciones de las Zonas Aridas).

The third meeting began with discussions led by the four subgroups of the Working Group, i.e., analysis of food webs, pollination networks, incidence-based co-occurrence patterns, and abundance-based co-occurrence patterns. Currently, the food web subgroup is focusing on detecting universal patterns of trophic interactions. The pollination network subgroup is refining models and biological hypotheses. The incidence-based co-occurrence subgroup is focusing on characterizing the Plackett-Luce model of community assembly, while the abundance-based co-occurrence subgroup is finishing analyses and writing a paper incorporating abundance data into examination of species co-occurrence patterns. The Working Group aims to complete four papers by the end of 2010, one on each of the four sub-areas. The next meeting is scheduled for December 2010.


NIMBioS Working Groups are chosen to focus on major scientific questions at the interface between biology and mathematics. NIMBioS is particularly interested in questions that integrate diverse fields, require synthesis at multiple scales, and/or make use of or require development of new mathematical/computational approaches. NIMBioS Working Groups are relatively small (10-12 participants, with a maximum of 15), focus on a well-defined topic, and have well-defined goals and metrics of success. Working Groups will typically meet 2-4 times over a two-year period, with each meeting lasting 3-5 days; however, the number of participants, number of meetings, and duration of each meeting are flexible, depending on the needs and goals of the Group.