A survey on automated methods to detect vulnerabilities in protocol implementations Author: Anqi Chen

Introduction

Official specifications define the expected behaviors of protocols, but concrete implementations in different languages and for various platforms may introduce flaws, which can take different forms: like the control logic of the state machine abstracted from the protocol is problematic, or the implementation is faulty due to developers' carelessness, or the interactions with different platforms cause trouble.

When a protocol system has been developed, the sound next step is to think about how to test it: whether it's correct, robust, resilient, efficient, etc. Developers can manually design test cases, and launch specific attacks against the target protocol system, which might be tedious but still not comprehensive. So researchers turn their eyes to developing automated methods to detect the vulnerabilities in protocol implementations and much interesting work has been done in this area. This survey paper will focus on the following 3 papers on this topic:

First, a summary of the problem that each of the three papers was intended to address, the methods proposed and implemented, and the experiments and results derived. A comparison of the three papers is then discussed, along with some insights and observations that may inspire researchers to make progress in the field.

Protocol State Fuzzing of TLS Implementations

Introduction

The paper is published in Usenix Security 2015, which used black-box testing in state machine learning to infer state machines from different TLS protocol implementations, then the authors investigate the state machines to look for spurious behavior, which might be exploitable vulnerabilities. In three TLS implementations, the authors found new security flaws, and they also pointed out that this work could be applied for fingerprinting purposes since the unique state machine is generated in a different implementation.

Background

TLS and attacks on it

TLS is a protocol serving for secure network connections, which experienced different versions and naming changes. Subprotocols inside TLS are:

The handshake protocol to establish session keys and optionally authenticate the server and / or client;
The changecipher protocol consisting only one message to show the start of encrypted messages using the session key;
The alert messages used to indicate different faults, with alert level and one-byte description.

Besides, extensions may be added to the protocol, for example, the Heartbeat extension to deal with the Heartbleed attack.

And the attacks targeted on it can be divided into several categories, like the cryptographic attacks which find vulnerabilities in hash functions, the timing attacks, and implementation flaws which both may introduce severe security risks (for example, the Heartbleed attack).

Protocol State Fuzzing

Previous work may focus on abstract descriptions of TLS using formal methods, but many security problems may root in mistakes in implementations. Then formally verified TLS implementations are proposed. However, current fuzzing works mainly are limited to single messages like certificates. And this paper's technique might help guide the fuzzing direction serving that purpose.

The technique protocol state fuzzing means that different sequences of messages are sent to black-box testing and the state machine is automatically extracted. As to related work in state machine learning, two kinds of methods exist: passive learning and active learning. In the former one, message traffic is observed to infer the state machine; while in the latter one, the implementation is actively queried by the learning algorithm, and the model is constructed based on the responses.

The state machine analysis may be done by hand or model checker. One example of analyzed vulnerability is that security-sensitive processing actions are taken before the authentication step is done.

This work only relied on manual analysis of state machines to detect flaws.

Method

LearnLib

The tool this paper used in state machine learning is LearnLib which requires the input of (1) a list of messages it can send to the SUT(the input alphabet); (2) a command to reset the SUT to the initial state. With the sequences to send and the reset command, LearnLib comes up with a hypothesis for the inferred state machine, which is later checked to see if having counterexamples, if a counterexample exists, LearnLib uses it to refine the hypothesis. If no counterexample is found, then the hypothesis is returned to be the estimated model of the implementation.

Improved Chow's W-method

The checking part is a form of model-based testing, and the paper proposed an improved version of Chow's W-method. The W-method is guaranteed to be correct given an upper bound for the number of states. The authors specified the depth of search in LearnLib, and the upper bound of the W-method is set to the number of found states plus the specified depth, which means that only counterexample traces within that upper bound will be explored. They modify W-method by eliminating the search area where Connection Close has been resulted, to mitigate the high performance cost caused by W-method.

Test Harness

Test harness, which means the method to translate between abstract messages in the input alphabet and concrete messages to be sent to the SUT, can be different when a server or client implementation is tested.

The work used different input alphabets when testing a client or server implementation, and support all the regular TLS messages as well as the Heartbeat Extension. Besides, they created some special symbols to correspond to the exceptions: (1) Empty (no data is received from the SUT before a timeout occurs); (2) Decryption failed (after the ChangeCipherSpec message); (3) Connection closed (socket exception happens). And since cryptographic operations exist in the protocol, some states are introduced in the harness to keep track of the information used in key exchange, the actual keys resulting, and whether the ChangeCipherSpec was received or not.

Experiments and results

In experiment and result part, nine different implementations are analyzed. They used demo client and server applications that came with the different implementations except with Jave Secure Socket Extension (JSSE) for which they wrote simple server and client applications. The paper includes the models inside which the authors found the security issues (GnuTLS, JSSE, OpenSSL), as well as the model of RSA BSAFE for Java to illustrate how simple the state machine can be.

Given a model, they first manually analyze if there are more paths than expected that can lead to a successful exchange of application data, which may indicate a security flaw. Also, the authors determine whether the model contains more states than necessary or unexpected transitions. Starting from the unexpected behaviors in the model, the authors perform an in-depth analysis to find the root and severity of the vulnerabilities.

One observation is that all the models of server-side implementations are very different, which they believed to be a useful way to do fingerprinting, i.e., identify the implementation a particular server is using.

Then particularities of the learned models are discussed in detail.

GnuTLS

The GnuTLS server and client both have the problem of mistakenly dealing with the heartbeat message: upon receiving the heartbeat message, a buffer is reset, which previously is used to collect all handshake messages to compute the hash when the handshake is finished. So once it is reset, that will cause the mismatch between hashes computed by the server and the client. To exploit the flaw, the attacker should be able to reset the buffer at the same time on the server and client side, which even though is difficult itself, still exposed the security risk.

JSSE Java

For JSSE Java version 1.8.0_25, the authors observed several paths leading to a successful exchange of application data and more states than expected, from which they detected the security issues. One path with flaws is that despite the server does not receive the ChangeCipherSpec message from the client, it still responds with a ChangeCipherSpec message to a plaintext message sent from the client. A similar problem will be encountered when negotiating new keys, where the client can keep using old keys while the server will start to use new keys. If client messages in plaintext can be accepted, then anyone can tamper with it and cause problems. When the authors analyzed the client, the same issue is also confirmed there, and it can be concluded that the confidentiality and integrity of the data being exchanged can be fully corrupted now.

OpenSSL

When testing with OpenSSL, they found that the Heartbeat Request message sent during the handshake will be saved and only responded to after the handshake phase was finished. So they have to eliminate it from the input alphabet to avoid infinite models.

One vulnerability is detected around an additional ChangeCipherSpec message, after which all messages are responded to with a "Bad record MAC" alert indicating that wrong keys are being used. After the additional ChangeCipherSpec message, a ClientHello message is expected, followed by a ServerHello message. However, OpenSSL does change the server keys but does not use the new randoms from the ClientHello and SeverHello to compute new keys, instead, old keys are used.

Automated Attack Discovery in TCP Congestion Control Using a Model-guided Approach

Introduction

The paper is published in NDSS 2018 with the CISCO Network Security Distinguished Paper Award, and the target protocol is a specific part of TCP protocol, i.e., Congestion Control. Previous work to find the attacks only manually analyze the vulnerabilities and design certain attacks, and also may be limited to implementation languages or platforms. This work's contribution is that it designs and implements a method to automatically explore the possible attacks on congestion control, not limited by a certain language or platform.

In detail, the authors turned their eyes to the FSM(finite state machine) which can be abstracted from the protocol specification and naturally holds much information about the inner logic of the protocol. The authors are inspired by the model-based testing method and first derive some abstract attack strategies by traversing the FSM with a certain target algorithm. And then the abstract strategies are converted into concrete actions by mapping with domain knowledge, including how to craft the packets, in which control congestion state the packet should be inserted. The authors create a test experiment and used their technique to successfully generate 6585 strategies and 2436 attacks.

Background

Congestion Control

Congestion control hopes to achieve two goals: (1)efficient delivery based on network conditions; and (2) fairness with respect to other TCP flows in the network, by regulating the sender to send at an adaptive rate corresponding to the current network congestion state and the receiver's processing state.

So if congestion control is manipulated by an attacker, like deliberately fooling the sender to send at a faster or slower rate than it should do, application level performance may be affected negatively. That may further cause a denial of service or video stream quality to degrade, leading to economical damage or other bad effects.

One of the most important message categories is the Acknowledgement package. If an acknowledgment package is sent many times, it will indicate a packet loss or network congestion, and perhaps cause the sender to slower its sending rate by some scale.

The work focuses on Acknowledgement-based attacks.

TCP attacks

Since TCP traffic is not protected by an encryption mechanism, but instead only "sequence number" is used to protect the integrity of messages, this paper assumes that attackers can sniff the traffic and get sequence numbers as well as the data that has been acknowledged or sent. The attackers can therefore modify or spoof packages and insert them at an appropriate time to launch the attack.

The attackers can be classified into two classes: (1) on-path attackers can intercept the package, modify and insert it back into the data stream; or craft new acknowledgment packages and insert them into the data stream; (2) off-path attackers can spoof packages and insert them into the data stream. On-path attackers can act as a switch on the transporting path between the client and server while off-path attackers can act by observing the traffic on the client's local network. One important difference between on-path and off-path attackers is that off-path attackers cannot prevent the correct acknowledgment packages from reaching the receiver.

Method

Abstract the FSM

The work abstracts the finite state machine based on a classic TCP protocol TCP New Reno. In the paper, there is a state transition graph that clearly specifies the 4 states, transition conditions, and corresponding actions.

Some important terms that are used inside the model include (1) 2 events: ACK and RTO Timeout; (2) 5 variables: cwnd, ssthresh, dupACKctr, high_water, rto_timeout.

Acknowledgments(ACK) are byte-based and cumulative, which means that the receiver acknowledges the highest byte of the data at which all prior data has been received. So a duplicate acknowledgment or 3 duplicate acknowledgments are used to signal the current network congestion information.

RTO timeout means that data is outstanding without any acknowledgment received for several Round-Trip-Times (RTTs), which indicates an even more severe network congestion situation since the last acknowledgment. The timer is started when new data is sent, reset on every acknowledgment and stopped when all data has been acknowledged.

Congestion Window(cwnd) represents the number of bytes that TCP is allowed to have in the network at any time, which can be manipulated by TCP to adapt to different network conditions.

Slow Start Threshold (ssthresh) indicates the value of cwnd at which TCP switches to a congestion avoidance state.

Duplicate ACK(dupACKctr) tracks the number of duplicate acknowledgments received in slow start and congestion avoidance, and if 3 duplicate acknowledgments are received, the state will be switched into fast recovery.

The highest sequence sent (high_water) records the highest sequence number sent prior to entering fast recovery, and only when it has been acknowledged or a timeout occurred will fast recovery be exited.

RTO timeout (rto_timeout) indicates the current length of the RTO timeout, and if the RTO timer expires, this value will be doubled, resulting in an exponential backoff state.

The control state can be classified into 4 phases: (1) slow start; (2) congestion avoidance; (3)fast recovery; (4) exponential backoff.

Slow start means that in the phase, TCP will rapidly increase the sending rate, trying to quickly utilize the available bandwidth of the path while not overloading the network. On each acknowledgment, cwnd is incremented by MSS(Maximum Segment Size), resulting in a doubling of the sending rate every RTT. TCP exits slow start only when timeout or packet loss if the ssthresh is set as MAX_INT.

In congestion avoidance, TCP is sending close to the estimate of the available bandwidth while also slowly probing the additional bandwidth, and the state is only exited on timeout or receiving three duplicate acknowledgments.

Fast recovery means that during this phase, TCP is recovering from a lost packet indicated by three duplicate acknowledgments, so it halves cwnd to cut the sending rate and retransmits the lost packet, setting ssthresh as the new value of cwnd to provide an approximate estimation in case of a timeout. Fast recovery is only exited when the last packet sent in high_water is acknowledged or when an RTO timeout occurs.

Derive abstract attack strategies

With the FSM, abstract attack strategies can be derived and guide malicious concrete actions.

The attacks have two categories of expected effects: (1) decreasing throughput and (2) increasing throughput; where in the former case, the sender is fooled to slow down the sending rate; and in the latter case vice versa.

One example is called "Blind Throughput Reduction", where the attacker sends spoofed invalid acknowledgments to the target connection's receiver, causing it to send duplicate acknowledgments to the sender, which will mislead the sender about the congestion status of the network.

"Optimistic Ack Attack" means that the attacker impersonates the receiver by sending acknowledgments for the data that has not actually been received, causing the sender insensitive to the actual congestion of the network and unfair with competing connections.

One abstract attack strategy means a path in the FSM that includes either a cycle or a terminating state and includes a transition that either increases or decreases cwnd, which further affects TCP's sending rate. So they used a modified depth-first traverse algorithm to enumerate all the paths in the FSM and generate all the abstract strategies.

Concrete actions

After the abstract strategies are obtained, the concrete actions to achieve them are created by mapping with domain expert knowledge.

The set of basic actions can be classified into two categories: (1) injection of acknowledgments; and (2) modification of acknowledgments. In the first category, three basic actions are designed with detailed parameters: (1) duplicate acknowledgments; (2) offset acknowledgments (acknowledge less or more than the actual data); and (3) Incrementing acknowledgments(inject a series of acknowledgments where the acknowledgment number increase by a variable). In the second category, 4 basic actions are designed: (1) Divison; (2)Duplication; (3) Pre-acknowledgment; (4) Limiting.

TCPWN: the automated platform

This work designed an automated platform called TCPWN, which includes 4 modules: the State Tracker to track which control state the current connection is in; the abstract strategy generator; the concrete strategy generator; and the attack injector.

As to tracking state, since the authors hope to get a model that can suit several implementations with subtle differences, they consider the trade-off between precision and generalization and choose to approximate by focusing on the core states. Besides, they assumed a bulk transfer application that always has available data to send.

The authors did experiments with 5 implementations of TCP in 5 operating systems. The testing environment consists of two clients and two servers, all of which are Linux machines and there is a single bottleneck link between them to provide an environment where two flows compete for bandwidth.

Two metrics are used to detect the attack's success: the time taken to transmit and the amount of data transferred in the connection at the TCP level.

Experiment findings

Over all the experiments done, 6585 strategies are tested and 2436 attacks are found, which are classified into 11 classes, and 8 of them are unknown in the literature: including (1)On-path repeated slow start; (2) Amplified Bursts; (3) Ack Lost Data; (4) Slow Injected Acks; (5) Sawtooth Ack; (6) Dup Ack Injection; (7) Ack Amplification; and (8) off-path repeated slow start. Some of the aforementioned attacks hold weird logic which might be difficult for people to manually find.

Automata-Based Automated Detection of State Machine Bugs in Protocol Implementations

Introduction

The paper is published in NDSS 2023, and it focuses on how to automatically detect state machine bugs in implementations of stateful network protocols, which is severe since any deviations from the standard specifications may introduce exploitable vulnerabilities, like bypassing authentication or establishing insecure connections.

They proposed a black-box technique for detecting state-machine bugs in implementations of stateful network protocols which is fully automated with the input model of the implementation (which can be derived using model learning) and a catalog of state-machine bugs for the protocol (to produce that a systematic approach is introduced later) in the form of DFAs.

As to bug detection, the work adapts existing techniques from model checking and combines the model with a bug pattern to obtain a set of candidate requirement-violating inputs. To validate the bug, they apply the candidate inputs until an actual requirement violation is observed or a threshold is reached to compensate for the inaccurate model effects. And the experiments with several complex protocol implementations also validate that their technique can find all existing bugs from two previous works very fast, as well as find new bugs.

Background

Model learning and model checking

State fuzzing is an approach to automatically infer state machine descriptions of protocol implementations. It uses a black-box technique called model learning, which observes how the implementation deals with the input test cases to produce state machine models that describe how the implementation handle message flows. When the state machine is obtained, researchers can further detect the flaws inside the control logic.

The approach in their work combines techniques from both model learning and model checking.

Most model learning algorithms produce a deterministic Mealy machine as a model by operating in two alternating phases: hypothesis construction and hypothesis validation. And the implementation of model learning typically needs to know the abstraction of how to map concrete protocol packets into alphabet symbols and vice versa.

In the hypothesis construction phase, sequences of input symbols are sent to the SUT, and the sequences of output symbols are observed. When certain convergence criteria are met, then a hypothesis is constructed by the learning algorithm. For input sequences that are not seen before, the hypothesis will predict outputs by extrapolating from the observed outputs.

Now to verify whether the predictions are right, the Hypothesis Validation phase is switched into. During validation, the SUT is subjected to a conformance testing algorithm. If a counterexample is found, the hypothesis construction phase is repeated to build a refined hypothesis. If no counterexample is found, then the current hypothesis is returned as the inferred model of the SUT.

Besides, if the loop of hypothesis construction and validation does not terminate, it indicates that the utilized learning algorithm cannot abstract the SUT into a deterministic Mealy machine, which perhaps is because its size and complexity exceed the algorithm's ability. But still, the last constructed hypothesis can be used as an approximate model of the SUT.

One related work is model-based testing. In model-based testing, the model typically is manually supplied and reflects the desired (instead of actual) behaviors of the system under test (SUT). However, this work regards models as reflecting the actual behaviors of the SUT and provides requirement-violating input to compensate for the inaccuracies in the learned model.

Previous work on detecting bugs

Previous work manually analyzes to detect bugs, like using expert knowledge to do pairwise comparisons between models of different implementations. DIKEUE generates input sequences that can cause two models to act differently, but to tell if the difference is an actual bug or benign or false positive, manual classification is needed. The manual analysis has many shortcomings: it requires effort and expertise, but could still miss some bugs. Also, the learned model can be big with hundreds of states and if the bugs are fixed, the newly learned model may differ a lot and should be manually analyzed again. It's also hard to produce concrete witnesses of the bugs through only ocular investigation.

Previous work uses the expression type linear temporal logic (LTL) and model checking against the learned models. However, the information that can be held within LTL expressions is more limited compared with being expressed via automatons.

Method

Overview

To summarize, the work (1) proposes a new automated black-box technique for detecting vulnerabilities in stateful protocol implementations, with the model and a set of protocol state machine bugs encoded as automata; (2) tested their technique against 3 implementations of SSH servers and 9 DTLS server and client implementations, and produced previously unknown or missed bugs; (3) proposed a method to reasonably assemble a set of bug patterns.

The work tested their approach on different implementations of SSH-2 and DTLS-1.2.

Bug specification by automata

A bug specification is a catalog of bug patterns while a bug pattern is typically a small automaton that specifies a particular error in the logic of an implementation, like sending two messages in the wrong order. To get a complete set of bug patterns, researchers can assemble them from requirements in RFC, or use previously reported bugs.

Nevertheless, another way is to provide a typically not small automaton that encodes a set of allowed sequences of exchanged messages and reports whenever the observed sequence deviates. This work combines the above two kinds of methods together.

The work contributes by designing ways to automatically verify the model against the specifications of allowed message orderings, encoded by finite automata. Each automaton observes the sequence of messages exchanged during a protocol interaction and reports when the sequence violates the requirement encoded inside the automata.

SSH protocol

The SSH protocol is a client-server protocol to enable accessing network services securely over an untrusted network. A typical SSH session holds three phases, corresponding to three layers:(1) the Transport Layer which uses a symmetric key to establish secure communication; (2) the Authentication Layer which enables the client to authenticate with the server; (3) the Connection Layer which transports the application level services.

The three phases are (1)Key establishment; (2) Authentication; (3) Connection. During Key Establishment, the client and server establish the keys to secure higher layers, and in the phase, a Diffie-Hellman Key exchange should be supported by all the implementations. During Authentication, all implementations must support authentication using public keys. If authenticated successfully, then during the Connection phase, various services provided by the server can be transported to the client via a channel. SSH allows multiple channels to operate simultaneously.

DTLS protocol

The DTLS(Datagram Transport Layer Security) protocol is an adaption of the cryptographic TLS protocol for use with Datagram-based transport protocol such as UDP, extending the original TLS by allowing for messages arriving out of order or fragmented. When a connection is established firstly, a secret is negotiated by Handshake Protocol, which later is used to encrypt messages symmetrically by the Record Protocol. During the handshake establishment, a certificate may be required and the cipher suite choice may be renegotiated.

Obtain the state machine and bug pattern catalog

The model of the SUT's state machine is mainly obtained by model learning tools. Sometimes if the learning cannot converge within a certain time, the generated model may be regarded as an approximate one and further a reduction may need to be done to get a readable Mealy Machine with a reasonable number of states.

A bug pattern is expressed in form of an automaton, which means that the automaton will accept the sequences of messages that expose the vulnerability. For example, if the server requests a certificate, and then sends a ChangeCipherSpec(CCSc) message to enter the handshake finish, it exposes a bug and such a message sequence will cause the automaton to come into the terminating bug state. The paper describes several automata examples to explain how to encode a vulnerability into an automaton. But to construct all the DFAs, a quantity of manual work has to be done either by investigating the requirement files or from previous knowledge.

Nevertheless, the paper also talks about a systematic way to construct a set of bugs from empty (since for DTLS clients, no previous work exposes the bug pattern set) with a description of allowed protocol behavior. Also, they used the technique for DTLS and SSH servers to include other missed bug patterns.

Take the DTLS for example, apart from protocol-specific state machine bugs, two general regulation exists in the example: (1) a client must never send a message which deviates from the given sequence, and (2) on receiving an unexpected message from the server, the client should end the session either by ignoring subsequent server messages or responding to them by Alert.

In the example, the constructed DFA can be divided into three parts: the paths of allowed message sequences; the paths of deviating client messages, and the paths of deviating server messages. And the above-mentioned rules, if violated, can be detected by transitions from a state in allowed paths to a state in the deviating paths.

Further, to solve the problem of providing information about what requirement is violated, when a violation is found, they proposed to add specific bug patterns to the bug pattern catalog, and the specific ones are tried first before the more general ones.

Translate from Mealy machine to DFA and intersect

With the Mealy machine model and the bug pattern catalog DFAs, there is still a gap to figure out whether the model contains state machine bugs. The two expressions are different: Mealy machines specify input-output relations of the symbols in their transitions while DFAs define a language, i.e., the set of words it will accept.

The work proposes a technique to convert the Mealy machine model into a DFA and intersect with each of those bug pattern DFAs. If the intersection results in no state remaining, that means the implementation does not exhibit bug patterns. Otherwise, the bug-exposing test sequences are extracted and applied.

After the extraction, test cases are constructed to be applied to the SUT. Since the model might not faithfully express the behaviors of the implementation, they iteratively generate the sequences in the accepted language of the intersected DFA, until finding a bug or some threshold is reached. In detail, they use Breadth-first search (BFS) to generate sequences of increasing length and restrict the search by a suitable K which means that each state is visited at most K times.

Implementations and experiments

In the implementation part, for SSH servers, they reused and changed previous work on model learning and checking to do the mapping between message packets and abstract symbols. Also, the authors constructed a general bug pattern similar to what they did with DTLS clients, which enabled them to derive a new specific bug pattern, Missing SERVICE_REQUEST_USERAUTH. For DTLS implementations' learning and testing, they used DTLS-Fuzzer to do the translation. Variables in a state are used to fill in some message fields to parameterize some symbols, like the type of certificate type, the symmetric key for encryption or decryption, etc.

In the evaluation part, the paper mainly focuses on whether known bug sets can be detected and validated, whether new bugs can be found, how their technique performs on newer versions of the protocol implementations, and how well it performs on DTLS client implementations which have not been subjected to model learning.

Experiments on 3 SSH servers and 9 DTLS implementations (several versions of them) show that all the previously known state machine bugs can be automatically detected and validated.

Besides, their technique detected 5 previously unknown bug patterns in the case of the SSH server, and when they applied the bug patterns to the latest versions of SSH server implementations, the authors detected 4 previously unknown bugs. For the DTLS servers, the paper found 7 additional bugs that were missed by previous visual inspection, and for the DTLS clients, 2 new vulnerabilities and 64 previously unknown bugs were found.

Comparison and thoughts

All three papers look to address the general problem of finding vulnerabilities in protocol implementations but differ in the specific protocols they target, the intuitions of the attack, and the techniques they use to build state machines and detect vulnerabilities.

The first paper addresses the TLS implementation, the second paper focuses on the congestion control part of TCP, and the third paper validates their techniques on two protocols: SSH and DTLS.

Even in the state machine construction part, they differ a lot. The first paper uses existing model-learning tools and improved model-checking methods to build an approximate model. The second paper simply abstracts the state machine based on the classic TCP protocol without relying on automatic tools for state fuzzing. The third paper also uses different model learning tools to obtain state machines, based on which the authors mentioned an interesting observation: the learned state machine is so unique that it can be used as a method for fingerprinting.

As for finding the vulnerabilities and designing the attacks, the first paper simply manually analyzed the learned state machine, and started a deeper inspection after noticing spurious states, transitions, or paths; while the second paper designed the attacks inspired by some key side effects brought by certain transition actions; and the third paper encodes the bug patterns into automata and intersects with the DFA translated from Mealy machine(the learned model of one implementation) to test if concrete counterexamples can be found simultaneously.

Although the first paper mostly used existing tools and did only manual inspection, they still found many interesting vulnerabilities, which might inspire us that simple methods may work while complicated approaches may not necessarily reward more.

The second paper focuses on one part of a specific protocol and some key observations of side effects brought by certain state transitions are utilized to construct attacks. Even though the work with such a specific focus may not be directly reused in other ways, the mind flows behind it can still inspire other researchers to come up with similar solid work.

The most recent one of these 3, the last paper, seems to be way more complicated, which hopes to automatically detect vulnerabilities, and the key method is encoding bug patterns into automata and doing intersection. Also, the technique proposed in the paper is more generalized and hopefully can be applied on other protocol implementations, which credits to the general form(DFA) and the generalizable idea they propose.

In the end, the foreseeable future trend is to develop automated methods rather than manual ones, to try to detect more vulnerabilities that might be missed with previous approaches, and to facilitate testing and verification of more protocol implementations in a more general way.