Next-Gen Lie Detector: Stack Selection

Follow the steps of the creation of a lie detector's backend stack and learn the importance of remaining open to unconventional solutions for common tasks.

Grigorii Novikov

Jul. 05, 24 · Tutorial

The first lie detector that relied on eye movement appeared in 2014. The Converus team, together with Dr. John C. Kircher, Dr. David C. Raskin, and Dr. Anne Cook, launched EyeDetect, a new solution for detecting deception quickly and accurately. This event became a turning point in the polygraph industry.

In 2021, we finished working on a contactless lie detection technology based on eye tracking and presented it at the International Scientific and Practical Conference. I was part of the development team, and in this article I would like to share some insights into how we built the new system, particularly how we chose our backend stack.

What Is a Contactless Lie Detector and How Does It Work?

We created a multifunctional hardware and software system for contactless lie detection. This is how it works: the system tracks a person's psychophysiological reactions by monitoring eye movements and pupil dynamics and automatically calculates the final test results.

Its software consists of 3 applications.

Administrator application: Allows the creation of tests and the administration of processes
Operator application: Enables scheduling test dates and times, assigning tests, and monitoring the testing process
Respondent application: Allows users to take tests using a special code

The respondent receives instructions on how to take the test on the computer screen, accompanied by audio (either synthesized or pre-recorded by a specialist). The instructions are followed by written true/false statements based on developed testing methodologies.

The respondent reads each statement and presses the "true" or "false" key according to their assessment of the statement's relevance. After half a second, the computer displays the next statement.

The lie detector then measures response time and error frequency, extracts features from recordings of eye position and pupil size, and calculates the significance of the statement, or the "probability of deception."
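As a rough illustration of the first two measurements, the sketch below computes mean response time and error frequency from a list of recorded answers. The `Response` record and its field names are hypothetical, and the eye-position and pupil-size features, as well as the deception model itself, are deliberately left out.

```python
from dataclasses import dataclass
from statistics import mean


@dataclass
class Response:
    """One answered statement (hypothetical record layout)."""
    shown_at: float     # seconds, when the statement appeared
    answered_at: float  # seconds, when the key was pressed
    answer: bool        # key pressed by the respondent
    expected: bool      # answer the methodology treats as correct


def summarize(responses: list[Response]) -> dict[str, float]:
    """Return mean response time (s) and the share of wrong answers."""
    if not responses:
        raise ValueError("at least one response is required")
    times = [r.answered_at - r.shown_at for r in responses]
    errors = sum(r.answer != r.expected for r in responses)
    return {
        "mean_response_time": mean(times),
        "error_frequency": errors / len(responses),
    }
```

In a real system, these two numbers would be only part of the input to the scoring model, alongside the eye-tracking features.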

For a clearer picture, here is a comparison of the traditional polygraph and the contactless lie detector.

Criteria	Classic Polygraph	Contactless Lie Detector
Working Principle	Registers changes in GSR, cardiovascular, and respiratory activity to measure emotional arousal	Registers involuntary changes in eye movements and pupil diameter to measure cognitive effort
Duration	Tests take from 1.5 to 5 hours, depending on the type of examination	Tests take from 15 to 40 minutes
Report Time	From 5 minutes to several hours; written reports can take several days	Test results and reports in less than 5 minutes automatically
Accuracy	Screening test: 85%; Investigation: 89%	Screening test: 86-90%; Investigation: 89%
Sensor contact	Sensors are placed on the body, some of which cause discomfort, particularly the two pneumatic tubes around the chest and the blood pressure cuff	No sensors are attached to the person
Objectivity	Specialists interpret changes in responses. The specialist can influence the result. Manual evaluation of polygraphs requires training and is a potential source of errors.	Automated testing process ensuring maximum reliability and objectivity. AI evaluates responses and generates a report.
Training	Specialists undergo 2 to 10 weeks of training. Regular advanced training courses	Standard operator training takes less than 4 hours; administrator training for creating tests takes 8 hours. Remote training with a qualification exam.

As you can see, our lie detector makes the process more comfortable and convenient than a traditional polygraph. First of all, the tests take less time, from 15 to 40 minutes. The results are also available almost immediately, because they are generated automatically within minutes. Another advantage is that no sensors are attached to the body, which removes a source of discomfort in an already stressful environment. Operator training is also less time-consuming. Most importantly, the credibility of the results remains very high.

Backend Stack Choice

Our team had experience with Python and asyncio. Previously, we developed projects using Tornado. But at that time FastAPI was gaining popularity, so this time we decided to use Python with FastAPI and SQLAlchemy (with asynchronous support).

To complement our choice of a popular backend stack, we decided to host our infrastructure on virtual machines using Docker.

Avoiding Celery

Given the nature of our lie detector, several mathematical operations take too long to run inside an HTTP request, so we moved them into multiple background tasks. Although Celery is a popular framework for such tasks, we opted to implement our own task manager.

This decision stemmed from our use of CI/CD, where we restart various services independently. Sometimes, services could lose connection with Redis during these restarts. Our custom task manager, extending the base aioredis library, ensures reconnection if a connection is lost.
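The reconnection idea can be sketched without any Redis dependency. In this minimal example, `call_with_reconnect` is a hypothetical helper, and the built-in `ConnectionError` stands in for whatever exception the Redis client raises when the connection drops; the actual task manager may work differently.

```python
import asyncio
from typing import Awaitable, Callable, TypeVar

T = TypeVar("T")


async def call_with_reconnect(
    operation: Callable[[], Awaitable[T]],
    reconnect: Callable[[], Awaitable[None]],
    attempts: int = 5,
    base_delay: float = 0.1,
) -> T:
    """Run `operation`, reconnecting and retrying on ConnectionError."""
    for attempt in range(attempts):
        try:
            return await operation()
        except ConnectionError:
            if attempt == attempts - 1:
                raise  # out of retries: surface the error to the caller
            # Exponential backoff: base_delay, 2x, 4x, ...
            await asyncio.sleep(base_delay * 2 ** attempt)
            await reconnect()
    raise ValueError("attempts must be at least 1")
```

A worker loop could wrap each queue read in this helper so that a service restart during deployment delays a task instead of losing it.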

Background Tasks Architecture

At the project's outset, we had a few background tasks, which increased as functionality expanded. Some tasks were interdependent, requiring sequential execution. Initially, we used a queue manager where each task, upon completion, would trigger the next task via a message queue. However, asynchronous execution could lead to data issues due to varying execution speeds of related tasks.

We then replaced this with a task manager that uses gRPC to call related tasks, ensuring execution order and resolving data dependency issues between tasks.
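The ordering guarantee can be illustrated with plain coroutines. In this simplified sketch, local `async` functions stand in for the gRPC calls, and the task names are hypothetical; the point is only that the manager awaits each dependent step before starting the next one.

```python
import asyncio
from typing import Any, Awaitable, Callable

Step = Callable[[dict[str, Any]], Awaitable[dict[str, Any]]]


async def extract_features(data: dict[str, Any]) -> dict[str, Any]:
    await asyncio.sleep(0.02)  # simulate a slow step
    return {**data, "features": [1, 2, 3]}


async def score(data: dict[str, Any]) -> dict[str, Any]:
    await asyncio.sleep(0.01)  # faster, but needs "features" first
    return {**data, "score": sum(data["features"])}


async def run_in_order(steps: list[Step], data: dict[str, Any]) -> dict[str, Any]:
    """Await each step before starting the next, passing results along."""
    for step in steps:
        data = await step(data)
    return data


if __name__ == "__main__":
    print(asyncio.run(run_in_order([extract_features, score], {"test_id": 42})))
```

Because `score` only starts after `extract_features` has returned, the faster task can no longer run ahead of the data it depends on.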

Logging

We couldn't use popular bug-tracking systems like Sentry for a few reasons. First, we didn’t want to use any third-party services managed and deployed outside of our infrastructure, so we were limited to using a self-hosted Sentry. At that time, we only had one dedicated server divided into multiple virtual servers, and there weren't enough resources for Sentry. Additionally, we needed to store not only bugs but also all information about requests and responses, which required the use of Elastic.

Thus, we chose to store logs in Elasticsearch. However, memory leak issues led us to switch to Prometheus and Typesense. The main reason for the switch was resource usage: Elasticsearch often demands a large amount of memory, and whatever we allocated never seemed to be enough. This is a common problem discussed in various forums. Since Typesense is written in C++, it requires considerably fewer resources. Maintaining backward compatibility between Elasticsearch and Typesense was a priority for us, as we were still determining whether the new setup would meet our needs. The decision worked out well, and we saw improvements in resource usage.
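One way to keep two log backends interchangeable is to hide them behind a small interface. The sketch below is illustrative rather than the project's actual code: `LogSink`, `InMemorySink`, and `log_exchange` are hypothetical names, and the in-memory class stands in for a wrapper around a real Elasticsearch or Typesense client.

```python
import time
from typing import Any, Protocol


class LogSink(Protocol):
    """Anything that can store one log document."""

    def write(self, document: dict[str, Any]) -> None: ...


class InMemorySink:
    """Stand-in for a wrapper around a real search backend."""

    def __init__(self) -> None:
        self.documents: list[dict[str, Any]] = []

    def write(self, document: dict[str, Any]) -> None:
        self.documents.append(document)


def log_exchange(sinks: list[LogSink], method: str, path: str,
                 status: int, request_body: str, response_body: str) -> None:
    """Write one request/response record to every configured sink."""
    document = {
        "timestamp": time.time(),
        "method": method,
        "path": path,
        "status": status,
        "request_body": request_body,
        "response_body": response_body,
    }
    for sink in sinks:
        sink.write(document)
```

During a migration, passing both the old and the new sink keeps the two stores in step, and dropping the old one later becomes a one-line change.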

Full-Text Search (FTS)

With PostgreSQL as our main database, we needed an efficient FTS mechanism. In our previous experience, PostgreSQL's built-in tsquery and tsvector did not perform well enough with Cyrillic text. Thus, we decided to synchronize PostgreSQL with Elasticsearch. While not the fastest solution, it provided enough speed and flexibility for our needs.
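One piece of such a synchronization is turning database rows into a request body for Elasticsearch's bulk API, which expects newline-delimited JSON with an action line before each document. The sketch below assumes each row is a dictionary with an `id` key; the function name, index name, and field names are hypothetical, and how the rows are selected and sent is left out.

```python
import json
from typing import Any, Iterable


def build_bulk_body(index: str, rows: Iterable[dict[str, Any]]) -> str:
    """Build an NDJSON body for Elasticsearch's _bulk endpoint."""
    lines: list[str] = []
    for row in rows:
        # Action line: index the document under the row's primary key.
        lines.append(json.dumps({"index": {"_index": index, "_id": str(row["id"])}}))
        # Source line: remaining fields; ensure_ascii=False keeps Cyrillic readable.
        source = {key: value for key, value in row.items() if key != "id"}
        lines.append(json.dumps(source, ensure_ascii=False))
    if not lines:
        return ""
    # The bulk API requires the body to end with a newline.
    return "\n".join(lines) + "\n"
```

Reusing the primary key as the document ID means that re-sending a changed row replaces the indexed copy instead of creating a duplicate.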

PDF Report Generation

As you may know, generating PDFs in Python can be quite complicated. The main challenge is that you typically have to create an HTML file first and only then convert it to PDF, similar to how it's done in other languages.

This conversion process can sometimes produce unpredictable artifacts that are difficult to debug. Meanwhile, generating PDFs with JavaScript is much easier. We used Puppeteer to create an HTML page and then save it as a PDF just as we would in a browser, avoiding these problems altogether.

To Conclude

In conclusion, I would like to stress that this project was demanding when it came to choosing the right solutions, but it was also more than rewarding. We received numerous unconventional customer requests that often questioned standard rules and best practices.

The most exciting part of the journey was implementing mathematical models developed by another team into the backend architecture and designing a database architecture to handle a vast amount of unique data. It made me realize once again that popular technologies and tools are not always the best option for every case. We always need to explore different methodologies and remain open to unconventional solutions for common tasks.
