ACM DL

ACM Journal of

Experimental Algorithmics (JEA)

Menu
Latest Articles

KADABRA is an ADaptive Algorithm for Betweenness via Random Approximation

We present KADABRA, a new algorithm to approximate betweenness centrality in directed and undirected graphs, which significantly outperforms all... (more)

Better External Memory LCP Array Construction

The suffix array, perhaps the most important data structure in modern string processing, needs to be augmented with the longest-common-prefix (LCP) array in many applications. Their construction is often a major bottleneck, especially when the data is too big for internal memory. We describe two new algorithms for computing the LCP array from the... (more)

BlockQuicksort: Avoiding Branch Mispredictions in Quicksort

It is well known that Quicksort -- which is commonly considered as one of the fastest in-place sorting algorithms -- suffers in an essential way from branch mispredictions. We present a novel approach to addressing this problem by partially decoupling control from dataflow: in order to perform the partitioning, we split the input into... (more)

SimBa: An Efficient Tool for Approximating Rips-filtration Persistence via <underline>Sim</underline>plicial <underline>Ba</underline>tch Collapse

In topological data analysis, a point cloud data P extracted from a metric space is often analyzed by computing the persistence diagram or barcodes of a sequence of Rips complexes built on P indexed by a scale parameter. Unfortunately, even for input of moderate size, the size of the Rips complex may become prohibitively large as the scale... (more)

Fully Dynamic 2-Hop Cover Labeling

The 2-hop Cover labeling of a graph is currently the best data structure for answering shortest-path distance queries on large-scale networks, since it combines low query times, affordable space occupancy, and reasonable preprocessing effort. Its main limit resides in not being suited for dynamic networks since, after a network change, (1) queries... (more)

Strong Steiner Tree Approximations in Practice

In this experimental study, we consider Steiner tree approximation algorithms that guarantee a constant approximation ratio smaller than 2. The considered greedy algorithms and approaches based on linear programming involve the incorporation of k-restricted full components for some k ≥ 3. For most of the algorithms, their strongest theoretical... (more)

Linear and Efficient String Matching Algorithms Based on Weak Factor Recognition

We present a simple and very efficient algorithm for string matching based on the combination of weak factor recognition and hashing. Despite its... (more)

Solving Graph Problems via Potential Maximal Cliques: An Experimental Evaluation of the Bouchitté--Todinca Algorithm

The BT algorithm of Bouchitté and Todinca based on enumerating potential maximal cliques, originally proposed for the treewidth and minimum... (more)

Automated Congressional Redistricting

Every 10 years, when states are forced to redraw their congressional districts, the process is intensely partisan, and the outcome is rarely fair and democratic. In the past few decades, the growing capabilities of computers have offered the promise of objective, computerized redistricting. Unfortunately, the redistricting problem can be shown to... (more)

NEWS

JEA joins RCR: Replicated Computational Results Initiative

http://jea.acm.org/rcr_initiative.cfm

In Memoriam: David S. Johnson

http://dl.acm.org/citation.cfm?id=2907073

About JEA

The Journal of Experimental Algorithmics (ISSN 1084-6654) is a high-quality journal devoted to the study of discrete algorithms and data structures from an empirical perspective. The journal welcomes original submissions that focus on design, implementation, and performance evaluation through a combination of experimentation and classical techniques.

read more
Forthcoming Articles

Geometric Heuristics for Rectilinear Crossing Minimization

New Exact and Heuristic Algorithms for Graph Automorphism Group and Graph Isomorphism

Abstract. Five new algorithms, named Vsep, are described. Four of them are for determining the generators, orbits and order of the graph automorphism group. The fifth algorithm is for finding an isomorphism between two graphs. A basic tool of these algorithms is an iterative adjacency refinement procedure that produces finer output partition from a given input partition of graph vertices. All nonequivalent discrete partitions derived from the selected vertices are stored. This is the main difference of the exact version with the known algorithms for graph automorphism group. A new strategy is used in the exact algorithm: if during its execution some of the searched or intermediate variables obtain a wrong value then the algorithm continues from a new start point losing some of the results determined so far. The new start point is such that the correct results are obtained. The proposed algorithms have been tested on well-known benchmark graphs and compared with one of the most competitive known tool for the worst cases (Traces). The results show that for some graph families the exact Vsep algorithm outperforms Traces, and vice-versa for some of the others. The heuristic versions use reduced search trees. They are almost exact and are many times faster than the exact one. The heuristic algorithms are a good choice for the user because of their smaller running times. The experiments show that the running time of Vsep algorithms does not depend on the vertex labeling.

Adaptive Cuckoo Filters

We introduce the adaptive cuckoo filter (ACF), a data structure for approximate set membership that extends cuckoo filters by reacting to false positives, removing them for future queries. As an example application, in packet processing queries may correspond to flow identifiers, so a search for an element is likely to be followed by repeated searches for that element. Removing false positives can therefore significantly lower the false positive rate. The ACF, like the cuckoo filter, uses a cuckoo hash table to store fingerprints. We allow fingerprint entries to be changed in response to a false positive in a manner designed to minimize the effect on the performance of the filter. We show that the ACF is able to significantly reduce the false positive rate by presenting both a theoretical model for the false positive rate and simulations using both synthetic data sets and real packet traces.

Listing All Maximal k-Plexes in Temporal Graphs

Many real-world networks evolve over time, that is, new contacts appear and old contacts may disappear. They can be modeled as temporal graphs where interactions between vertices (which represent people in the case of social networks) are represented by time-stamped edges. One of the most fundamental problems in (social) network analysis is community detection, and one of the most basic primitives to model a community is a clique. Addressing the problem of finding communities in temporal networks, Viard et al. [TCS 2016] introduced ?-cliques as a natural temporal version of cliques. Himmel et al. [SNAM 2017] showed how to adapt the well-known Bron-Kerbosch algorithm to enumerate ?-cliques. We continue this work and improve and extend the algorithm of Himmel et al. to enumerate temporal k-plexes (notably, cliques are the special case k=1). We define a ?-k-plex as a set of vertices with a lifetime, where during the lifetime each vertex has in each consecutive ?+1 time steps edges to all but at most k-1 vertices in the chosen set of vertices. We develop a recursive algorithm for enumerating all maximal ?-k-plexes and perform experiments on real-world social networks that demonstrate the practical feasibility of our approach. In particular, for the special case of ?-1-plexes (that is, ?-cliques), we observe that our algorithm is on average significantly faster than the previous algorithms by Himmel et al. [SNAM 2017] and Viard et al. [IPL 2018] for enumerating ?-cliques.

Network Flow-Based Refinement for Multilevel Hypergraph Partitioning

We present a refinement framework for multilevel hypergraph partitioning that uses max-flow computations on pairs of blocks to improve the solution quality of a k-way partition. The framework generalizes the flow-based improvement algorithm of KaFFPa from graphs to hypergraphs and is integrated into the hypergraph partitioner KaHyPar. By reducing the size of hypergraph flow networks, improving the flow model used in KaFFPa, and developing techniques to improve the running time of our algorithm, we obtain a partitioner that computes the best solutions for a wide range of benchmark hypergraphs from different application areas for both the connectivity and the cut-net metric while still having a running time comparable to that of hMetis. In the case of graph partitioning, our algorithm compares favorably with KaFFPa, even after enhancing the latter with our improved flow network, and at the same time is more than a factor of two faster. Finally, we show that our algorithm improves the performance of the memetic multilevel hypergraph partitioner KaHyPar-E.

Evaluating and Tuning n-fold Integer Programming

In recent years, algorithmic breakthroughs in stringology, computational social choice, scheduling, etc., were achieved by applying so-called n-fold integer programming. An n-fold integer program (IP) has a highly uniform block structured constraint matrix. Hemmecke et al. [Math. Programming, 2013] showed an algorithm with runtime \Delta^O(rst+r^2s)n^3, where \Delta is the largest coefficient, r,s, and t are dimensions of blocks of the constraint matrix and n is the total dimension of the IP; thus, an algorithm efficient if the blocks are of small size and with small coefficients. The algorithm works by iteratively improving a feasible solution with augmenting steps, and n-fold IPs have the property that augmenting steps are guaranteed to exist in a not-too-large neighborhood. We have implemented the algorithm and learned the following. The original algorithm is practically unusable, but a series of improvements make its evaluation possible. Crucially, a certain constant in the algorithm can be treated as a tuning parameter, yielding an efficient heuristic. Furthermore, the algorithm uses an overly expensive strategy to find a "best" step, while finding only an "approximatelly best" step is much cheaper, yet sufficient for quick convergence. Using this insight, we improve the asymptotic dependence on n from n^3 to n^2log(n). We show that decreasing the tuning parameter initially leads to an increased number of iterations needed for optimality and eventually to local optima, as expected. However, surprisingly small values of the parameter are often enough. Second, our new strategy for finding "approximatelly best" steps wildly outperforms the original construction.

Fully-Dynamic Graph Algorithms Inspired by Distributed Computing: Deterministic Maximal Matching and Edge Coloring in Sublinear Update-Time

We study dynamic graphs in the fully-dynamic {centralized} setting. In this setting the vertex set of size $n$ of a graph $G$ is fixed, and the edge set changes step-by-step, such that each step either adds or removes an edge. Dynamic graphs have various applications in fields such as Communication Networks, Computer Graphics and VLSI Design. The goal in this setting is maintaining a solution to a certain problem (e.g., maximal matching, edge coloring) after each step, such that each step is executed efficiently. The running time of a step is called {\em update-time}. Prior to the current work, for several central problems, the best known deterministic algorithms for general graphs were the naive ones with update-time O(n). The question of existence of sublinear in $n$ update-time deterministic algorithms remained wide open. In this paper we address this question by devising sublinear update-time deterministic algorithms for maximal matching in graphs with bounded neighborhood independence o(n/ \log^2 n)}, and for proper O(\Delta)-edge-coloring in {general graphs}. The family of graphs with bounded neighborhood independence is a very wide family of dense graphs, and represent very well various types of networks. In order to obtain our results we employ a novel approach that adapts certain distributed algorithms of the { LOCAL} setting to the centralized fully-dynamic setting. This is achieved by optimizing the work each processors performs, and efficiently simulating a distributed algorithm in a centralized setting. Our experiments on various network topologies and scenarios demonstrate that our algorithms are highly-efficient in practice.

Comparing two clusterings using matchings between clusters of clusters

Clustering is a fundamental problem in data science, yet, the variety of clustering methods and their sensitivity to parameters make clustering hard. To analyze the stability of a given clustering algorithm while varying its parameters, and to compare clusters yielded by different algorithms, several comparison schemes based on matchings, information theory and various indices (Rand, Jaccard) have been developed. We go beyond these by providing a novel class of methods computing meta-clusters within each clustering? a meta-cluster is a group of clusters, together with a matching between these. Let the intersection graph of two clusterings be the edge-weighted bipartite graph in which the nodes represent the clusters, the edges represent the non empty intersection between two clusters, and the weight of an edge is the number of common items. We introduce the so-called D-family-matching problem on intersection graphs, with D the upper-bound on the diameter of the graph induced by the clusters of any meta-cluster. First we prove NP-completeness and APX-hardness results, and unbounded approximation ratio of simple strategies. Second, we design exact polynomial time dynamic programming algorithms for some classes of graphs (in particular trees). Then, we prove spanning-tree based efficient algorithms for general graphs. Our experiments illustrate the role of D as a scale parameter providing information on the relationship between clusters within a clustering and in-between two clusterings. They also show the advantages of our built-in mapping over classical cluster comparison measures such as the variation of information (VI).

All ACM Journals | See Full Journal Index

Search JEA
enter search term and/or author name