# OpenModeller Maxent

### From OpenBio Wiki

## Introduction

The first implementation of a Maximum Entropy algorithm in openModeller (OM-MAXENT) was released in version 0.6 in 2008. This implementation was based on an existing third-party library, and the resulting models did not have the same quality as those produced by other algorithms. The algorithm was later rewritten based on Matlab code provided by S. Phillips, and several further versions were released. Most of this work was performed by Elisângela Rodrigues as part of her doctorate. The resulting models improved considerably compared with the first version, but unfortunately they still differed from those produced by the Maxent software.

This page describes the activity under OpenBio to produce yet another version of OM-MAXENT, this time compatible with the original Maxent software. Compatibility here means acceptable differences, as we are dealing with a complex algorithm implemented in two different programs. It is also important to note that not all functionality available in Maxent was implemented in this new version. In particular, the jackknifing tool, the collecting bias input and the possibility of using categorical maps are currently not available in OM-MAXENT, as Use Case 2 only needs the linear, quadratic, product, threshold, hinge and autofeatures parameters. Therefore, the new version focused on compatibility for these parameters only.

## Methods

The strategy for achieving compatibility involved the following steps: 1) finding a way to guarantee that OM-MAXENT and Maxent are run with exactly the same input; 2) defining criteria to compare the output produced by the two implementations, including logs and maps; 3) making successive adjustments to the openModeller code; 4) comparing the results until an acceptable difference is reached. Because the algorithm is deterministic, repeated runs on the same input give the same output, which greatly facilitated the comparison of results.

The first step required using the same algorithm parameters, input points, environmental layers and background points. A standard experiment based on the example data provided by openModeller was created: 65 presence points of *Thalurania furcata boliviana* and 2 environmental layers (rain_coolest and temp_avg). The layers had to be converted into the ArcInfo ASCII Grid format supported by Maxent. Ten thousand background points were sampled beforehand and the corresponding environmental values were retrieved with om_sampler. To make sure that both programs used the same background points, the points had to be provided to openModeller as absence points (through the parameter "Use absences as background") and to Maxent with the parameter "-e".
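For readers unfamiliar with the ArcInfo ASCII Grid format mentioned above, the following is a minimal Python sketch of how a layer could be written in that format. It is only an illustration of the file layout (a six-line header followed by rows of cell values from north to south), not the conversion procedure actually used in this work. The function name `write_ascii_grid` and its arguments are hypothetical, and the default no-data value of -9999 is an assumption.

```python
def write_ascii_grid(path, rows, xllcorner, yllcorner, cellsize, nodata=-9999):
    """Write a 2D list of cell values as an ArcInfo ASCII Grid file.

    `rows` is a rectangular list of lists; the first row is the northernmost.
    Cells set to None are written as the no-data value.
    """
    nrows = len(rows)
    ncols = len(rows[0]) if nrows else 0
    if nrows == 0 or ncols == 0 or any(len(r) != ncols for r in rows):
        raise ValueError("rows must be a non-empty rectangular 2D list")

    # Standard six-line header of the format.
    header = [
        f"ncols {ncols}",
        f"nrows {nrows}",
        f"xllcorner {xllcorner}",
        f"yllcorner {yllcorner}",
        f"cellsize {cellsize}",
        f"NODATA_value {nodata}",
    ]
    with open(path, "w") as out:
        out.write("\n".join(header) + "\n")
        for row in rows:
            cells = (str(nodata if v is None else v) for v in row)
            out.write(" ".join(cells) + "\n")
```

Both layers of an experiment need the same extent and cell size for the two programs to see identical input, which is why the header values are passed explicitly here.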

The following criteria were used for comparing results:

- Correlation (r) between the produced maps.
- Number of iterations.
- Proportion of matching best features selected after each iteration.
- Final loss.
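The first and third criteria in the list above can be computed directly from the outputs of the two programs. The following is a minimal Python sketch, assuming the two maps have already been read into flat sequences of cell values in the same cell order, and that the name of the best feature chosen at each iteration has been extracted from each log. The function names `pearson_r` and `matching_feature_proportion` are hypothetical and are not part of openModeller or Maxent. Counting the extra iterations of a longer run as mismatches is also an assumption; the source does not say how runs of different length were compared.

```python
import math


def pearson_r(xs, ys):
    """Pearson correlation between two equally sized sequences of cell values."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("maps must have the same number of cells (at least 2)")
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant map")
    return cov / math.sqrt(var_x * var_y)


def matching_feature_proportion(features_a, features_b):
    """Fraction of iterations in which both runs selected the same best feature.

    Iterations are compared position by position. Assumption: when the runs
    differ in length, the extra iterations of the longer run are mismatches.
    """
    total = max(len(features_a), len(features_b))
    if total == 0:
        raise ValueError("no iterations to compare")
    matches = sum(1 for a, b in zip(features_a, features_b) if a == b)
    return matches / total
```

No-data cells would need to be dropped from both maps before calling `pearson_r`, otherwise the placeholder value would dominate the correlation.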

The next step was to change OM-MAXENT so that both programs produced a similar log (in fact, OM-MAXENT had to produce a more verbose log to help identify potential issues).

The Maxent version used in this work was 3.3.3e, and the Java VM version was 1.7.0.03.

Compatibility was achieved in three stages:

- Level 1: linear, quadratic and product features (~~Trac ticket #29~~)
- Level 2: threshold and hinge features (~~Trac ticket #30~~)
- Level 3: autofeatures (~~Trac ticket #31~~)

The following differences were in principle considered acceptable for this work: minimum correlation of 0.98, maximum difference of 3 iterations, at least 90% matching features, and a maximum difference of 0.01 in the final loss.
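The acceptance thresholds above can be expressed as a single check. The following is a minimal Python sketch, assuming the four measures have already been obtained for a pair of runs. The `RunSummary` class and the `is_compatible` function are hypothetical names introduced for illustration and do not exist in openModeller or Maxent.

```python
from dataclasses import dataclass


@dataclass
class RunSummary:
    """Values read from the log of one run (hypothetical container)."""
    iterations: int
    final_loss: float


def is_compatible(maxent, om_maxent, correlation, matching_proportion):
    """Apply the acceptance thresholds stated in the text.

    correlation: Pearson r between the two maps.
    matching_proportion: fraction (0 to 1) of matching best features.
    """
    return (
        correlation >= 0.98                                        # minimum correlation
        and abs(maxent.iterations - om_maxent.iterations) <= 3     # iteration difference
        and matching_proportion >= 0.90                            # matching features
        and abs(maxent.final_loss - om_maxent.final_loss) <= 0.01  # final loss difference
    )
```

For example, a pair of runs with 60 iterations each, identical final loss, 98.33% matching features and a map correlation of 0.9999983 passes all four checks.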

## Results

Level 1 compatibility was achieved in revision #5468 (openModeller Subversion repository).

Level 2 compatibility was achieved in revision #5506 (openModeller Subversion repository).

Level 3 compatibility was achieved in revision #5519 (openModeller Subversion repository).

Additional adjustments were made after that, so the latest revision recommended for tests is #5529.

Autofeatures OFF

Parameters | Iterations (Maxent) | Iterations (OM-MAXENT) | Proportion of matching best features | Difference in final loss | Correlation between the maps | Observations
---|---|---|---|---|---|---
linear | 60 | 60 | 98.33% | 0.0 | 1 | Only the last two best features didn't match due to differences in the 15th decimal place of the delta loss bound.
linear quadratic | 201 | 201 | 100% | 0.0 | 1 | |
linear product | 60 | 60 | 100% | 0.0 | 0.9999983 | |
linear quadratic product | 180 | 180 | 100% | 0.0 | 1 | |
linear threshold | 161 | 161 | 100% | 0.0 | 1 | |
linear quadratic threshold | 420 | 420 | 100% | 0.0 | 1 | |
linear product threshold | 141 | 141 | 100% | 0.0 | 1 | |
linear quadratic product threshold | 161 | 161 | 100% | 0.0 | 1 | |
linear hinge | 420 | 420 | 100% | 0.0 | 1 | |
linear quadratic hinge | 441 | 441 | 100% | 0.0 | 1 | |
linear product hinge | 341 | 341 | 100% | 0.0 | 1 | |
linear threshold hinge | 420 | 420 | 100% | 0.0 | 1 | |
linear quadratic threshold hinge | 500 | 500 | 100% | 0.0 | 1 | |
linear product threshold hinge | 341 | 341 | 100% | 0.0 | 1 | |
linear quadratic product hinge | 341 | 341 | 100% | 0.0 | 1 | |
linear quadratic product threshold hinge | 341 | 341 | 100% | 0.0 | 1 | |

Autofeatures ON

Parameters | Iterations (Maxent) | Iterations (OM-MAXENT) | Proportion of matching best features | Difference in final loss | Correlation between the maps | Observations
---|---|---|---|---|---|---
linear | 60 | 60 | 98.33% | 0.0 | 1 | |
linear quadratic | 201 | 201 | 100% | 0.0 | 1 | |
linear product | 60 | 60 | 98.33% | 0.0 | 1 | |
linear quadratic product | 201 | 201 | 100% | 0.0 | 1 | |
linear threshold | 60 | 60 | 98.33% | 0.0 | 1 | |
linear quadratic threshold | 201 | 201 | 100% | 0.0 | 1 | |
linear product threshold | 60 | 60 | 98.33% | 0.0 | 1 | |
linear quadratic product threshold | 201 | 201 | 100% | 0.0 | 1 | |
linear hinge | 420 | 420 | 100% | 0.0 | 1 | |
linear quadratic hinge | 441 | 441 | 100% | 0.0 | 1 | |
linear product hinge | 420 | 420 | 100% | 0.0 | 1 | |
linear threshold hinge | 420 | 420 | 100% | 0.0 | 1 | |
linear quadratic threshold hinge | 441 | 441 | 100% | 0.0 | 1 | |
linear product threshold hinge | 420 | 420 | 100% | 0.0 | 1 | |
linear quadratic product hinge | 441 | 441 | 100% | 0.0 | 1 | |
linear quadratic product threshold hinge | 341 | 341 | 100% | 0.0 | 1 | |

The same tests were performed with different numbers of input points, as this can influence results, especially when autofeatures is activated. The following numbers of points were used in the extra tests: 10, 13, 15, 30, 60, 80 and 90. All tests resulted in correlations greater than 0.99, more than 90% matching features and differences of no more than 2 iterations. These extra tests were performed with Java VM 1.5, with which there is usually a greater difference in the number of iterations and matching features.