
Working Group: Biological Problems Using Binary Matrices
(L to R): Daniel B. Stouffer, Robert Dorazio, Richard Barker, Diego Vazquez, Steven Schwager, Stefano Allesina, Nicholas Gotelli, Joshua Ladau, Steven Kembel; (Not Pictured): Jennifer Dunne, Dan Simberloff, Edward Connor. 
Topic: Working Group on Biological Problems Using Binary Matrices
Meeting dates: May 26-29, December 10-13, 2009; May 4-7, 2010; December 14-17, 2010
Organizers:
Edward Connor
(Dept. of Biology, San Francisco State Univ., San Francisco, CA);
Josh Ladau
(Gladstone Institutes, GICD, San Francisco, CA)
Objectives: Many fundamental questions in ecology cannot be addressed experimentally because, at the relevant large spatial and temporal scales, experimentation is impractical, unethical, or impossible. Instead, inferences about these questions must be made from observational data. Null model testing is a key tool for making these inferences, allowing the large-scale effects of processes such as environmental filtering, competition, and facilitation to be inferred from observations of species ranges, abundance distributions, body sizes, and similar traits. Three types of ecological data commonly analyzed using null models are binary presence-absence matrices, which give the distribution of species over a set of sites; ecological networks such as food webs and pollinator networks; and phylogenetic patterns in community composition. All of these data can be coded in binary form.
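To make the first data type concrete, the sketch below encodes a small presence-absence matrix (rows are species, columns are sites) and computes the checkerboard score, or C-score, a widely used co-occurrence statistic. The toy matrix and the choice of statistic are assumptions made for illustration; the text above does not say which statistics the working group used.

```python
from itertools import combinations

def c_score(matrix):
    """Mean number of checkerboard units over all species pairs.

    matrix: list of rows, one row per species, one 0/1 entry per site.
    For species i and j with row totals r_i and r_j and S shared sites,
    the pair contributes (r_i - S) * (r_j - S).
    """
    pairs = list(combinations(range(len(matrix)), 2))
    total = 0
    for i, j in pairs:
        r_i = sum(matrix[i])
        r_j = sum(matrix[j])
        shared = sum(a & b for a, b in zip(matrix[i], matrix[j]))
        total += (r_i - shared) * (r_j - shared)
    return total / len(pairs)

# Hypothetical data: 3 species observed across 4 sites.
toy = [
    [1, 0, 1, 0],
    [0, 1, 0, 1],
    [1, 1, 0, 0],
]
print(c_score(toy))  # prints 2.0
```

A null model test would compare this observed value with its distribution over matrices generated under a null hypothesis.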
The Binary Matrices working group focused on null model tests of binary data, with particular emphasis on the aforementioned examples. A key problem with null model tests is that they are generally developed and justified based on intuition. However, multiple tests can all seem intuitively appropriate for the same data yet yield conflicting conclusions. Hence, a pressing issue is developing and implementing an overarching mathematical framework to guide the development and application of null model tests. One such framework is optimality: for instance, considering methods that have minimal Type II error rates subject to controlled Type I error rates. Further application of the optimality framework is possible, and the development and application of other types of guiding frameworks are worth considering.
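One way to write down the optimality criterion described above is as a constrained minimization. The notation is an assumption introduced here for illustration: \phi denotes a test, \Theta_0 the set of null models, \theta_1 a fixed alternative, and \alpha the nominal significance level.

```latex
\[
\min_{\phi}\ \Pr_{\theta_1}\bigl(\phi \text{ fails to reject } H_0\bigr)
\quad \text{subject to} \quad
\sup_{\theta_0 \in \Theta_0} \Pr_{\theta_0}\bigl(\phi \text{ rejects } H_0\bigr) \le \alpha .
\]
```

The objective is the Type II error rate at \theta_1 and the constraint bounds the Type I error rate over the whole null class. A single test need not be optimal for every alternative at once, so in practice the criterion may have to be restricted to a particular alternative or class of tests.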
To implement null model tests developed in an optimality framework, it is often necessary to simulate random quantities from nonstandard probability distributions. For instance, simulating presence-absence matrices from a uniform distribution over the set of binary matrices with the observed marginal totals is necessary for implementing methods that control Type I error rates under a relatively broad class of statistical models. The algorithmic challenges associated with these simulations can be formidable and require the development of unbiased Markov chain Monte Carlo algorithms and analysis of their mixing times and convergence properties.
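The simulation problem described above can be sketched with a checkerboard-swap Markov chain. This is a minimal sketch of one standard approach, not necessarily the algorithm the working group developed. Each step picks two rows and two columns uniformly at random and flips the 2x2 submatrix only if it is a checkerboard; a step that finds no checkerboard still counts and leaves the matrix unchanged, which keeps the chain symmetric so that its stationary distribution is uniform. The function names and the value of `n_steps` are hypothetical, and choosing enough steps is exactly the mixing-time question raised in the text.

```python
import random

def swap_step(m, rng):
    """Attempt one checkerboard swap in place; do nothing if not swappable."""
    r1, r2 = rng.sample(range(len(m)), 2)
    c1, c2 = rng.sample(range(len(m[0])), 2)
    a, b = m[r1][c1], m[r1][c2]
    c, d = m[r2][c1], m[r2][c2]
    # A checkerboard is [[1, 0], [0, 1]] or [[0, 1], [1, 0]].
    if a == d and b == c and a != b:
        # Flipping all four cells preserves every row and column total.
        m[r1][c1], m[r1][c2] = 1 - a, 1 - b
        m[r2][c1], m[r2][c2] = 1 - c, 1 - d

def sample_fixed_margins(matrix, n_steps, seed=None):
    """Return a copy of matrix after n_steps attempted swaps.

    Requires at least two rows and two columns. n_steps is an assumption:
    this sketch does not check whether the chain has mixed.
    """
    rng = random.Random(seed)
    m = [row[:] for row in matrix]
    for _ in range(n_steps):
        swap_step(m, rng)
    return m

# Hypothetical observed matrix: 3 species by 4 sites.
observed = [
    [1, 0, 1, 0],
    [0, 1, 0, 1],
    [1, 1, 0, 0],
]
null_draw = sample_fixed_margins(observed, n_steps=1000, seed=42)
print([sum(row) for row in null_draw])  # row totals match those of observed
```

Repeating the draw many times and recomputing a test statistic on each simulated matrix gives a null distribution for that statistic.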
The development of null model tests for species co-occurrence, ecological networks, and community phylogenetics has relied on the assumption that the presence-absence matrix, food web, or community phylogeny is fully known. In the context of species co-occurrence, this has usually been articulated as assuming that the probability of detecting a species, given that it is present, is equal to 1. However, with limited sampling, differences in species’ abundance, and differences in species’ behaviors, some species are easier to detect than others, rendering such an assumption suspect at best. A separate literature, arising from statistical inference based on mark-recapture-release data, has begun to examine community patterns while relaxing the assumption that detection probability = 1. Further application of these methods to problems of multispecies co-occurrence patterns (number of species > 2) or to other community ecological patterns is ripe for development. Focusing generally on these three issues, this working group will aim to foster the investigation and development of solutions to these problems.
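The effect of imperfect detection can be illustrated with a small simulation. The sketch below assumes a simple observation model that is not taken from the text: each species has its own per-visit detection probability, visits are independent, and there are no false positives. The function name `observe` and all parameter values are hypothetical.

```python
import random

def observe(true_matrix, detect_probs, n_visits, seed=None):
    """Simulate an observed presence-absence matrix under imperfect detection.

    true_matrix: rows are species, columns are sites, entries are 0/1.
    detect_probs: per-visit detection probability for each species (assumed
    constant across sites). A species is recorded at a site only if it is
    truly present and detected on at least one of n_visits visits.
    """
    rng = random.Random(seed)
    observed = []
    for row, p in zip(true_matrix, detect_probs):
        obs_row = []
        for present in row:
            detected = present == 1 and any(
                rng.random() < p for _ in range(n_visits)
            )
            obs_row.append(1 if detected else 0)
        observed.append(obs_row)
    return observed

# Hypothetical truth: two species that co-occur at every site.
truth = [
    [1, 1, 1, 1, 1, 1],
    [1, 1, 1, 1, 1, 1],
]
# The second species is assumed to be much harder to detect than the first.
seen = observe(truth, detect_probs=[0.9, 0.3], n_visits=2, seed=7)
for row in seen:
    print(row)
```

Under this model, observed presences can only undercount true presences, so a co-occurrence statistic computed on the observed matrix may differ from its value on the true matrix even when the species always co-occur.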
The goals of the Binary Matrices working group were to bring together biologists, statisticians, and mathematicians to address these and related issues and so improve quantitative inference from binary data in biology.
Meeting Summaries for NIMBioS Working Group:
Biological Problems Using Binary Matrices
Meeting 1: May 26-29, 2009  Agenda (PDF)  Participants  Evaluation report (PDF)
Meeting 1 summary. The May 2009 meeting of the working group began with presentations that provided an overview of the existing analyses of binary matrices in biology. The presentations fostered extensive discussion on the challenges facing the analysis of binary matrices and approaches for addressing these challenges. Following the presentations, the working group broke into subgroups to develop specific research projects on the analysis of binary matrices. The subgroups identified four areas in which improved analyses are strongly needed:
 the analysis of food webs,
 pollination networks,
 incidence-based co-occurrence patterns, and
 abundance-based co-occurrence patterns.
The working group also developed a general strategy for developing improved analyses by combining ideas from ecology and mathematical statistics. The working group aims to complete four papers by the end of 2009 and meets again in December 2009.
Meeting 2: Dec 10-13, 2009  Agenda (PDF)  Participants  Evaluation report (PDF)
Meeting 2 summary. The meeting began with presentations from the four subgroups (the analysis of food webs; pollination networks; incidence-based co-occurrence patterns; and abundance-based co-occurrence patterns) and discussions about the progress that had been made since the last meeting. Following the presentations, the subgroups worked to further their projects. Substantial progress was made acquiring data sets for analysis, coding statistical methods, and discussing data-related matters and models. The next meeting for the group is scheduled for May 2010.
Meeting 3: May 4-7, 2010  Agenda (PDF)  Participants  Evaluation report (PDF)
Meeting 3 summary. The third meeting began with discussions led by the four subgroups of the Working Group, i.e., analysis of food webs, pollination networks, incidence-based co-occurrence patterns, and abundance-based co-occurrence patterns. Currently, the food web subgroup is focusing on detecting universal patterns of trophic interactions. The pollination network subgroup is refining models and biological hypotheses. The incidence-based co-occurrence subgroup is focusing on characterizing the Plackett-Luce model of community assembly, while the abundance-based co-occurrence subgroup is finishing analyses and writing a paper that incorporates abundance data into the examination of species co-occurrence patterns. The Working Group aims to complete four papers by the end of 2010, one on each of the four subareas. The next meeting is scheduled for December 2010.
Meeting 4: Dec 14-17, 2010  Agenda (PDF)  Participants  Evaluation report (PDF)
Meeting 4 summary. The final meeting began with discussions led by the four subgroups of the Working Group, i.e., analysis of food webs, pollination networks, incidence-based co-occurrence patterns, and abundance-based co-occurrence patterns. The discussions focused on outlining the progress that had been made since the last meeting and delineating the work remaining on the papers that the working group is writing. Following the presentations, the subgroups worked to further their projects. The food web subgroup focused on detecting universal patterns of trophic interactions. In the pollination network subgroup, work focused on refining models and biological hypotheses. The incidence-based co-occurrence subgroup focused on applying log-linear models to the problem of community assembly and applying the Plackett-Luce model to data on the colonization process. In the abundance-based co-occurrence subgroup, work focused on finishing analyses and writing. The abundance subgroup also discussed extensions of the hierarchical model they were developing to address other problems such as food webs, pollinator networks, and community phylogenetics. The working group is aiming to complete four papers by the end of 2011, one on each of the four subareas. Currently, each participant is working on two to three of these papers. There are no further meetings scheduled for the Binary Matrices working group. However, three group members (Dorazio, Ladau, and Connor) will meet in San Francisco in April 2011 to continue collaborations, and three group members (Dorazio, Allesina, and Ladau) will present papers at the International Environmetrics Society (TIES) meeting in July 2011 on research performed as part of the working group.
NIMBioS Working Groups are chosen to focus on major scientific questions at the interface between biology and mathematics. NIMBioS is particularly interested in questions that integrate diverse fields, require synthesis at multiple scales, and/or make use of or require development of new mathematical/computational approaches. NIMBioS Working Groups are relatively small (10-12 participants), focus on a well-defined topic, and have well-defined goals and metrics of success. Working Groups will typically meet 2-4 times over a two-year period, with each meeting lasting 3-5 days; however, the number of participants, number of meetings, and duration of each meeting are flexible, depending on the needs and goals of the Group.