Race Detection for Web Applications

Author: Farideh Khalili

This survey presents a summary and overview of the three following papers: 1. Effective race detection for event-driven programs. 2. Practical AJAX race detection for JavaScript web applications. 3. Effective static race detection for Java.

Introduction

This research aims to recognize data races caused by the asynchrony of events in web applications. The goal of this research is to minimize false positives.

Definition

A data race is a specific type of race condition that occurs when two or more threads or processes access the same shared variable or memory location concurrently. Moreover, at least one of those accesses is a write operation. In a data race, the order or timing of those accesses is not guaranteed, which can result in unpredictable behavior."

Asynchrony in Web Applications

The heavy use of asynchrony in event-driven programs can increase the likelihood of data race errors. This is particularly true in client-side web applications. Asynchronous operations enable multiple tasks to execute concurrently, potentially accessing shared data. When these operations are not synchronized correctly, it can lead to unpredictable and inconsistent program behavior, resulting in data races.

In event-driven programming, event handlers typically execute atomically in a single thread. This means that handler code can assume that no other handlers execute concurrently. However, event-driven systems allow for events to be fired non-deterministically, which can cause the corresponding event handlers to run in a non-deterministic order. This non-determinism can lead to data races, where two or more threads or processes access the same shared variable or memory location concurrently, resulting in unpredictable behavior. To avoid data races in event-driven programs, programmers must use synchronization mechanisms such as locks, mutexes, or atomic operations to ensure that shared variables are accessed in a safe and consistent manner.

Due to the limited synchronization primitives provided to programmers by the web platform, ad hoc synchronization is often necessary to coordinate event handler execution. This is due to the lack of synchronization primitives available. In other words, shared variables are commonly used by web applications as a means of synchronization, since there are few other options available.

In the field of software testing, detecting and fixing concurrency bugs, such as data races, is crucial for ensuring the reliability and performance of software systems. In this context, three recent papers, "EventRacer: A Predictive Runtime Analysis Tool for Discovering Event Races in Web Applications", "AjaxRacer: Detection of Data Races in Ajax Applications", and “Effective static race detection for Java” propose different methods for identifying data races in web applications. All of these papers employ a concept similar to scheduling of shared memory to analyze and identify potential data races. Additionally, these papers aim to minimize false positives and avoid infeasible race scenarios.

Paper structure

In this summary, I will provide an overview of the three papers and compare their approaches. The first two papers use dynamic analysis in contrast with the last one that uses static analysis. Finally, I will conclude by discussing the strengths and weaknesses of each approach and identifying potential areas for future research.

Background

In computer science, the term "race" can refer to a specific type of programming error known as a "race condition". A race condition occurs when program behavior depends on the sequence or timing of its component parts. However, the order or timing of those parts is not guaranteed by the program's design. This can lead to unpredictable or unintended behavior, such as incorrect output, crashes, or data corruption.

For example, consider a program that involves two threads of execution that access the same shared resource. If the program does not properly synchronize access to that resource, one thread may possibly modify the resource at the same time that the other thread is reading or modifying it. This can result in unexpected behavior, such as the wrong value being read or written to the resource.

To avoid race conditions, programmers must carefully design their programs to ensure that all threads access shared resources in a coordinated and synchronized manner. This can be accomplished using techniques such as locking, semaphores, and other synchronization mechanisms.

Unlike other types of race conditions, data races are a particularly pernicious type of bug because they can be difficult to detect and reproduce. They can lead to subtle, intermittent errors that may only occur under certain conditions or on certain hardware platforms. To avoid data races, programmers must use synchronization mechanisms such as locks, mutexes, or atomic operations to ensure that shared variables are accessed in a safe and consistent manner.

In the context of event-driven programs such as client-side web applications, a data race can occur when multiple event handlers attempt to access or modify the same shared data at the same time. For example, imagine a web application that has two event handlers: one that listens for user input and modifies a global variable, and another that listens for a timer event and reads the same global variable. If the user triggers the input event at the same time that the timer event fires, both event handlers will try to access the global variable simultaneously, potentially causing a data race.

Previous works in the area of race detection for multithreaded programs have had varying levels of success. Some techniques have been found to miss some races, while others have been prone to creating false positives, where some of the identified races were actually infeasible to produce. Attempts to reduce false positives have been made through ad hoc filtering, but these have not been consistently effective. In addition, some research has focused on detecting multithreaded spin locks, but this has required modification of the thread scheduler, which may not be feasible for web applications. These challenges highlight the need for more effective and efficient race detection techniques that can reliably identify true races while minimizing false positives and avoiding the need for complex modifications to the system.

Approach

Detecting data races can be done using either static analysis or dynamic analysis approaches.

Static analysis involves examining the source code or other artifacts of the system without executing it. The goal of static analysis is to identify potential issues, such as coding errors, security vulnerabilities, and other bugs before the software is even executed. Static analysis can be particularly useful in identifying issues that may not be immediately obvious during development or testing. It can also be used to ensure that code conforms to coding standards and other best practices.

Dynamic analysis, on the other hand, involves observing the behavior of the software during execution to identify potential issues. This approach can be done using tools that monitor the execution of the code and identify any issues that may arise during runtime. Dynamic analysis can be effective at identifying issues that occur in specific scenarios, but it can be difficult to reproduce the exact conditions necessary to trigger an issue.

Both static and dynamic analysis approaches have their own strengths and weaknesses. While static analysis can identify potential issues early in the development process, it may not be able to identify all issues. Dynamic analysis can identify issues that occur during runtime, but it may not be able to identify all potential issues. A combination of these approaches is often used to ensure the software is as robust and reliable as possible.

We'll start by discussing the two papers that have chosen the dynamic analysis approach and will give an overview of their methodology and algorithm.

"Effective Race Detection for Event-Driven Programs”

This paper presents a dynamic analysis technique for detecting data races in event-driven programs, which are programs that are driven by external events such as user input or system messages. Event-driven programs are typically written using event handlers or callbacks, which can make it challenging to detect data races using traditional race detection techniques.

Race Coverage

This paper introduces the notion of race coverage. Race a covers race b iff treating a as synchronization eliminates b as a race. This will then help to prune the overwhelming number of races detected by previous tools, as covered races are actually false positives. The evaluation of this paper indicated that the number of uncovered data races are 14x smaller on average than the set of all variables with races.

Event

The paper also introduces a language, “Event”. This language cleanly models essential concepts necessary to identify data races (read, write, begin, end, fork, join). Read and write are obvious. Begin and End denote the beginning and the ending of an event action. At any time, up to one event action can be executing, that is, event action execution is atomic.

What could create a data race, is what should the next event action scheduled be. All reads, writes and forks always happen between begin and end. When event action t, forks event action u, the execution of operation in t stops until event action u is executed. join(t, u) denotes that action t must wait until another event action u completes. An event action represents event handler execution for each user click on the button.

Vector Clock

Moreover, they present a dynamic analysis algorithm that efficiently computes races in event-driven programs by keeping the width of their vector clocks much smaller than the standard approach.

A vector clock is essentially a data structure that associates each node in a distributed system with a timestamp or logical clock value. Each node maintains its own vector clock, which is updated whenever an event occurs on that node. When nodes communicate with each other, they exchange their vector clocks to determine the relative ordering of events across the system.

In the context of event-driven programs, vector clocks can be used to track the ordering of events that occur on different threads or processes. By associating each event with a timestamp based on the vector clock, it is possible to detect data races and ensure that shared variables are accessed in a safe and consistent manner.

Technique

The paper describes a technique for using vector clocks to detect data races in event-driven programs by analyzing the ordering of events across multiple threads or processes.

The paper proposes the use of an event ordering mechanism that enables the detection of data races by tracking the relative order of events across multiple threads or processes. The event ordering mechanism is based on the concept of vector clocks, which are used to timestamp events and determine their relative order.

The technique involves identifying potential race conditions by looking for shared variables that are accessed by multiple threads or processes without proper synchronization. The ordering of events is then analyzed to determine if there is a potential data race.

The execution of a program is defined via traces, where a finite trace is a sequence of operations in their language. For a given trace π, if operation a occurs before operation b in π, they say that a <π b, assuming that each operation appears only once in the trace. Similarly, they say that event action t precedes event action u in the trace, denoted t <π u (they overload the operator), if begin(t) <π begin(u). Let races(π) denote the set of all races which occur in a trace π. An uncovered race is one for which no combination of races in the trace π cover it. A race R = (a, b) in a trace π is accessible, if there exists a trace π′ ∈ [π], such that b ∈ π′, but a ̸∈ π′. For a trace π, if R ∈ uncovered(π), then R is an accessible race, which means any race that is reported as uncovered is guaranteed to “exist” for some trace. This result is useful as it tells us that if I do not find a race in a given trace, then it means that there are no races for the other traces in [π]. ## "Practical AJAX race detection for JavaScript web applications" This paper recognizes the nondeterministic scheduling of events, specifically targeting AJAX events, as the root cause of data races in asynchronous client-server communication. In their approach they combine dynamic analysis with controlled execution to target identification of harmful AJAX event races. ### AJAX AJAX (Asynchronous JavaScript and XML) is a web development technique used to create interactive and responsive web applications. AJAX combines multiple technologies such as HTML, CSS, JavaScript, and XML/JSON to allow web pages to asynchronously communicate with a web server, enabling updates to web page content without requiring a full page reload. This allows for a smoother and more responsive user experience, as users can interact with the web application without waiting for a page to fully load. ### Controlled Execution Controlled execution is a technique used in software testing and analysis to observe and monitor the behavior of a program during runtime. The idea is to instrument the program code to track the execution of specific events or actions and record the data produced during execution. This allows developers and analysts to observe the behavior of the program in a controlled manner. In the context of this paper, controlled execution involves instrumenting a web application to track the order of events and identify shared variables that are accessed without proper synchronization. This instrumentation allows developers and analysts to monitor the behavior of the application as it runs, providing insights into potential race conditions and other concurrency-related bugs that may arise. ### Validation According to the authors, they observed that JavaScript developers typically test their code on fast and reliable networks and servers, which leads to the AJAX requests and responses being effectively synchronous during testing. This led them to define "expected" event schedules where an AJAX response event handler (eresp) executes immediately after the event handler (ereq) that sent the request. Any schedule where another event handler (e) is scheduled between ereq and eresp is considered less likely to occur during normal testing. The authors use this observation to detect AJAX event races, where the effects of e conflict with the effects of eresp.

In this context, schedules where AJAX is processed synchronously define the normal expected behavior, and adverse conditions are situations where the network or server is slow or unreliable allowing other events to interfere.

Event Types

The paper introduces a distinction between user events and system events, such as AJAX response events or timer events. System events are triggered directly or indirectly by user events after the webpage is loaded, thus each system event can be traced back to a user event. To describe this relationship, the term "derived from a user event" is employed.

Event Controller

They utilize an event controller mechanism inspired by EventRaceCommander to manage the scheduling of event handlers during test execution. This approach restricts nondeterminism by selectively delaying the execution of event handlers. To establish consistency in their terminology, they adopt the concept of a schedule, borrowed from concurrency in multi-threaded systems. A schedule refers to the fixed nondeterministic choices associated with a specific sequence of user events.

Technique

They use a two-phase approach. In the first phase, a dynamic monitoring mechanism is employed to identify conflicting AJAX response event handlers and to determine which event handlers can be reordered. This phase aims to identify user event handlers that may conflict with AJAX response event handlers.

In the second phase, event graphs are utilized to plan a series of tests. For each test, two event schedules are simulated, one with synchronous AJAX and another that simulates adverse network conditions. Any observable differences are then reported along with detailed information about the event schedules that resulted in them.

AjaxRacer generates a trace τu by monitoring the execution of each user event u and its associated system events. The trace is a sequence of operations that includes three kinds of actions: fork, join, and mutate-dom. The fork operation represents the creation of a new system event w of kind k by event v, to be dispatched at a later time. The join operation specifies that event w cannot occur before event v. The mutate-dom operation models the modification of the HTML DOM by event v, where the parameters x, y, w, and h specify the position and size of the affected bounding box on the screen.

By tracking the sequence of operations for each user event and its associated system events, AjaxRacer generates a detailed trace that captures the execution behavior of the web application. This trace includes information about the creation of new system events, the order in which events occur, and modifications to the HTML DOM. The fork operation indicates the creation of a new system event, while the join operation specifies that a particular system event cannot occur before a given event. The mutate-dom operation captures modifications made to the HTML DOM by an event, specifying the position and size of the affected bounding box on the screen. Together, these operations enable AjaxRacer to accurately capture the execution behavior of the web application and detect event race errors.

In contrast to EventRacer's concept of event actions, AjaxRacer generates a trace for each user event, as opposed to a global trace. Additionally, AjaxRacer uses a different model of memory accesses, which takes into account the effects of HTML DOM write operations on screen pixels, instead of low-level read/write operations. An event graph Gu = (N , E, l) is generated from each event trace that is triggered by a user event u. The nodes in the graph represent events, which are either the initial user event or the system events derived from it, the user event u is referred to as the (unique) root of Gu. The graph is directed, edges are generated for each fork or join operation, and each edge in the graph represents a constraint on the order of events.

Once the event graphs are extracted and potential conflicts are identified, I proceed to Phase 2, however, to minimize false positives and generate informative errors, Phase 2 aims to intentionally trigger observable race errors. A test is created for each pair of user events (ui, uj) that have the potential to cause AJAX conflicts. The test involves comparing the resulting webpage when the events are executed in a synchronized order versus when they are executed in an adverse situation. If there is a discrepancy between the two web pages, an error is reported. These comparisons are solely based on the comparison of the resulting webpages.

Results

When applied on a same test subject, this approach identified almost half the number of uncovered races that EventRacer had identified.

"Effective static race detection for Java"

While both papers I discussed earlier focus on dynamic analysis, Naik's paper proposes a static analysis approach to detecting data races in Java programs. Static analysis approaches work by analyzing the code without actually executing it, whereas dynamic approaches observe the behavior of the code during execution.

Technique

The race detection algorithm consists of four stages and four static analyses. The initial over-approximation of the pairs of memory accesses potentially involved in a race is refined successively in the following stages: reachable pairs, aliasing pairs, escaping pairs, and unlocked pairs.

To support these stages, the authors employ four static analyses: call-graph construction, alias analysis, thread-escape analysis, and lock analysis. The call-graph and alias analysis are mutually dependent and are computed simultaneously. Additionally, the thread-escape and lock analyses rely on both the call-graph and alias analyses.

They have noted that achieving precision in alias analysis requires k-object sensitivity, which is essential to their approach as all their other static analyses depend on it. However, using k-object sensitivity comes at a significant scalability cost, even for small values of k, such as k = 3. Most existing implementations of k-object sensitive alias analysis fail to handle k = 1 on many benchmarks due to memory limitations. Nevertheless, recent research has shown that Binary Decision Diagrams (BDDs) can scale whole-program context-sensitive analyses effectively. Therefore, the authors built Chord, a tool that utilizes BDDs for efficient and precise k-object sensitive alias analysis.

The authors' race detection algorithm is context-sensitive but flow-insensitive. Lack of flow sensitivity primarily affects the types of synchronization idioms their algorithm can handle. They can handle three idioms: lexically-scoped, lock-based synchronization, fork/join synchronization, and wait/notify synchronization. Lack of flow sensitivity helps scalability but harms precision. However, their approach is tailored to strike a balance between precision and scalability by focusing on context-sensitivity.

The call-graph and alias analysis are mutually dependent and are computed simultaneously. The thread-escape and lock analyses depend on both the call-graph and alias analyses. Overall, their race detection algorithm is an iterative refinement process that aims to reduce false positives while detecting true races.

Synchronization in Java

Java uses the synchronized (l) {...} construct to encourage programmers to acquire and release locks in a lexically-scoped manner. This construct is well-suited to flow-insensitive analysis.

Although lexically-scoped, lock-based synchronization is the main synchronization idiom, Java programs also use fork/join synchronization, which requires flow-sensitive analysis. However, the authors' approach lacks flow sensitivity, so they rely on annotations that indicate which threads cannot execute in parallel and which fields/objects cannot be accessed simultaneously by different threads.

The wait, notify, and notifyAll constructs have little effect on race detection. Before calling any of these methods on an object, a thread must first acquire a lock on that object. Calling wait on an object causes the calling thread to release the lock it acquired on that object before the call and then block until it reacquires the lock. Calling notify or notifyAll does not release any lock. Therefore, handling Java's wait/notify synchronization reduces to handling lock-based synchronization from the perspective of race detection.

Lastly, the lack of flow sensitivity in their race detection approach results in sacrificing soundness. The reason is that to determine whether a pair of accesses is guarded by a common lock, a must-alias analysis is necessary, which is inherently flow sensitive. However, their lock analysis relies on their may alias analysis, which is an unsound approximation.

Experiments

The authors have developed a tool called Chord, which is tested on an object-relational mapping system benchmark called JdbF to detect race conditions in multi-threaded programs. Chord generates a single-threaded harness that simulates a multi-threaded environment and presents a graph showing the paths in the program's call graph leading to the accesses of the field.

The approach employed by Chord has two important features: analyzing open programs and reporting counterexamples. This is particularly relevant for developers of multi-threaded applications that interact with the environment through an interface, as it allows them to detect race conditions before deployment. Chord's algorithm analyzes pairs of accesses to shared variables and determines if they can occur simultaneously by two different threads, making it capable of performing whole-program analysis and detecting races in open programs.

The first stage, reachable-pairs computation, computes an initial over-approximation of the set of pairs of memory accesses that could be involved in a race. Then, the algorithm uses four static analyses: call-graph construction, alias analysis, thread-escape analysis, and lock analysis, to prune the initial set of pairs and reduce the number of pairs to be checked for race conditions.

The second stage, aliasing-pairs computation, uses alias analysis to identify pairs of accesses that refer to the same memory location, and therefore, should not race.

The third stage, escaping-pairs computation, identifies objects that may be thread-shared using a thread-escape analysis. The algorithm retains a pair of accesses only if they access thread-shared data and reference the same memory location.

The final stage, unlocked-pairs computation, prunes pairs of accesses that may be involved in a data race by considering whether or not they are executed by threads that hold a common lock.

Limitations

However, the algorithm has some limitations. For example, it approximates the lock held on a variable by its singleton points-to set and only considers common locks at an abstract object level. To address these limitations, the algorithm uses an approximation method that is nearly complete and computationally efficient. Moreover, due to the lack of flow sensitivity in the approach, the set ReachablePairs computed in the first stage may still contain some false positives, which are removed in later stages of the algorithm.

Input

To use the algorithm, the user provides an open or closed program and a set of interfaces. The algorithm then synthesizes a harness that simulates many scenarios in which the program's environment might exercise its interface. The harness is single-threaded, and the race detection algorithm simulates executing each pair of calls in separate threads on shared data. The algorithm identifies data races by analyzing pairs of accesses to shared variables and determining if they can occur at the same time by two different threads. The user can also provide annotations per field or class to exclude pairs referencing those fields or classes.

Output

The tool finds pairs of data accesses that could cause a race condition and presents them as a graph. However, not all of these pairs actually result in race conditions, and it can be challenging to determine which ones are real. The tool categorizes the pairs of data accesses based on the field or abstract object referred to and provides two views to identify which pairs are false alarms or related to the same bug. The passage also provides an example of how the tool reports a race condition involving two threads accessing the same field of an object in a specific method.

Soundness

The approach has four sources of unsoundness, which means that it may not be able to detect some races: 1. The approach may not be able to detect all races because it uses a type of analysis that is not precise enough.

  1. The approach may not be able to detect races in some parts of the program because it does not analyze them accurately.

  2. The approach may not check all potential races because it skips over some parts of the program that usually do not have races but may still cause false alarms.

  3. The approach does not take into account some features of the program that can make it harder to detect races.

Results

For evaluation, they tested their race detection algorithm on a suite of open-source Java programs, both closed and open-source. The evaluation uses various annotations to determine whether threads can execute in parallel and whether fields/objects can be accessed simultaneously by different threads. In order to analyze the results, I should look at the time taken to run the tool, the number of annotations provided, and the number of harmful races, benign races, and false alarms detected.

The tsp program creates a main thread and an array of worker threads, and the TspSolver class is shared by all worker threads. The experiment reported 7 harmful races, 0 benign races, and 12 false alarms. The hedc program was also analyzed and reported 4 harmful races, 2 benign races, and 13 false positives. The ftp program had 45 harmful races, 3 benign races, and 23 false positives. The vect and htbl programs were also analyzed and reported 5 harmful races in vect 1.1 and 9 benign races in htbl 1.4.

Some fixes were needed to eliminate all harmful races. This includies adding synchronization to a piece of code, extending an existing synchronized block, changing the object on which the lock was held by a synchronized block, declaring a field volatile, or removing synchronization.

Compared to the two papers I discussed earlier, Naik's approach is more lightweight and does not require executing the program. However, it is limited in that it cannot detect all possible data races, since it relies on a soundness assumption about the locking behavior of the program. Additionally, it may report more false positives than the dynamic approaches since it does not have access to runtime information about the program's behavior.

Conclusion

The three papers I have discussed and used different methods for identifying data races in software systems. Effective Static Race Detection for Java focuses on static analysis, while Practical AJAX Race Detection for JavaScript Web Applications and Effective Race Detection for Event-Driven Programs use dynamic analysis techniques.

In terms of strengths, Effective Static Race Detection for Java performs well in detecting data races and is able to handle large codebases efficiently. Practical AJAX Race Detection for JavaScript Web Applications is effective in detecting data races in web applications and provides detailed error reports to help developers identify and fix issues quickly. Effective Race Detection for Event-Driven Programs is particularly good at detecting data races in event-driven programs and is able to handle complex asynchronous operations effectively.

However, each method has its weaknesses as well. Effective Static Race Detection for Java can produce false positives and may miss some data races in multi-threaded programs. Practical AJAX Race Detection for JavaScript Web Applications can be slow and may not handle some complex web application scenarios effectively. Effective Race Detection for Event-Driven Programs can generate many false positives and may miss some data races in complex programs.

Future research in this area could focus on combining static and dynamic analysis techniques to improve the accuracy and efficiency of data race detection. Additionally, there may be opportunities to develop specialized detection techniques for specific types of software systems, such as mobile apps or distributed systems. Overall, the field of data race detection is an important area for ongoing research and development, as it plays a crucial role in ensuring the reliability and performance of software systems in multi-threaded environments.