How to Hack Engineers: An “Unknown Knowledge Attack”
This article aims to offer computer technologists a provocative, alternative point of view on underlying vulnerabilities through social engineering.
Social, organizational, and educational issues can create vulnerabilities that surface in technical solutions years after their appearance. The industry’s strong focus on short-term technical fixes for such issues needs balancing: “social engineering” exploits symptoms, and if their underlying causes are not taken care of, the vulnerabilities will recur in changed patterns.
My Linux-dominated youth steered my career toward Linux engineering for several years. Yet even while playing with my shadow’s hashes and bashing the biting python, rainbows have never been my “salt of life”: my studies comprised mainly social-science disciplines, and I have worked in related fields for many years. My perspective on many challenges may therefore differ.
Common Vulnerabilities and Exposures (CVE) records document discovered security vulnerabilities. If a solved, CVE-recorded vulnerability is used to attack a machine, its exploitation may be a technical process, but its appearance is not: from a technical perspective, the vulnerability was identified, solved, and publicly documented by common institutions. Administrators, engineers, and developers (in short: professionals) often joke about how simple “social engineering” against users is. However, deficiencies in social, organizational, and educational architectures make professionals comparably vulnerable, just on lower abstraction layers. Besides, administrators are themselves users of the developer’s product: “user” is a situational designation.
Technical Trajectories and Reciprocal Anticipations: One ARP to Rule Them All
Today, communication on the Internet and on local networks relies mainly on IP addresses. But at the lowest level, when a message from the Internet arrives on a local network, our hardware is too dumb to work with IPs. It has to fall back to a lower-level protocol: the Address Resolution Protocol (ARP).
To this day, the destination network of a message has to map the recipient’s IP address to its corresponding hardware address, the MAC address, to ensure that the message arrives at its destination. Normally this is no problem: every operating system manages this simple process, so it seems we need not be aware of it.
ARP and its MAC addresses were never intended for networks as huge as today’s. The protocol was standardized in 1982, when no one worried about trustworthiness at that level. When a message enters a network, its gateway needs to look up the MAC address that corresponds to the destination IP. If the mapping is not already known, the gateway broadcasts a request into the local network asking for the MAC address of the destination IP. Some machine will answer (truthfully)! That is the standardized behavior.
However, someone can violate the rule. Under certain conditions, anyone on the network can pretend to be the recipient of the message by claiming to possess the recipient’s MAC; conversely, it can be possible to link the recipient’s IP to an attacker’s MAC. ARP’s standard does not consider such behavior: the original standard does not account for “lies”. In the worst (and today, mostly virtualized) case, faking the MAC of another machine can make the local DHCP service allocate the destination IP as well, handing the attacker both the destination MAC and the destination IP!
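To make concrete how little ARP demands of a responder, here is a minimal sketch in C of the message layout and of a forged reply. The struct and function names are illustrative, and real tooling would additionally wrap the message in an Ethernet frame and send it via a raw socket:

```c
#include <stdint.h>
#include <string.h>
#include <arpa/inet.h>

/* On-wire layout of an ARP message (RFC 826). Note what is absent: no
 * signature, no shared secret, no authentication of any kind. A receiver
 * simply believes whatever the sender_mac field claims. */
struct __attribute__((packed)) arp_msg {
    uint16_t hw_type;        /* 1 = Ethernet */
    uint16_t proto_type;     /* 0x0800 = IPv4 */
    uint8_t  hw_len;         /* 6: length of a MAC address */
    uint8_t  proto_len;      /* 4: length of an IPv4 address */
    uint16_t opcode;         /* 1 = request, 2 = reply */
    uint8_t  sender_mac[6];  /* claimed by the sender, never verified */
    uint8_t  sender_ip[4];
    uint8_t  target_mac[6];
    uint8_t  target_ip[4];
};

/* Forge a reply asserting "victim_ip is at attacker_mac". Any host on the
 * segment can emit such a frame; the protocol cannot detect the lie. */
void forge_arp_reply(struct arp_msg *m,
                     const uint8_t attacker_mac[6], const uint8_t victim_ip[4],
                     const uint8_t querier_mac[6], const uint8_t querier_ip[4])
{
    m->hw_type    = htons(1);
    m->proto_type = htons(0x0800);
    m->hw_len     = 6;
    m->proto_len  = 4;
    m->opcode     = htons(2);                 /* 2 = reply */
    memcpy(m->sender_mac, attacker_mac, 6);   /* the lie */
    memcpy(m->sender_ip,  victim_ip,    4);
    memcpy(m->target_mac, querier_mac,  6);
    memcpy(m->target_ip,  querier_ip,   4);
}
```

Nothing in the 28-byte message authenticates the sender, which is exactly the gap the attacks below exploit.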
Generally, this problem was solved many years ago. Physical switch ports and router log files can distinctly identify machines and their addresses, widely eliminating such attempts: faking a physical port is not easy. Virtual interfaces, however, have virtual ports.
I had a deeper look at the issue and tested the virtual networks included in contemporary virtual machine hypervisors. Testing twice, with Ubuntu and Fedora Linux systems, I examined VMware Workstation, QEMU/KVM (using libvirt’s virtual network), and VirtualBox in their (distribution) default configurations: three widespread hypervisors, with libvirt’s virtual network also being used in further virtualization solutions.
Indeed, once I controlled another virtual machine in the given local network (more precisely, the destination subnet), the factory configurations of VMware and QEMU/KVM/libvirt (installed from the default Fedora/Ubuntu repositories) let me take over MAC addresses. QEMU forwarded incoming connection attempts to the last machine that had claimed the destination MAC address but did not interfere with established connections. VMware, on the other hand, interrupted established connections; new connections were, as with QEMU, forwarded to the last machine that claimed the destination MAC address.
Further, QEMU allocated the IP address belonging to the recipient’s MAC address, while VMware did not. Nevertheless, on VMware I could eavesdrop on the entire traffic of the recipient’s MAC address, even without regularly resending faked packets; the latter was not possible on QEMU. VMware’s behavior is less likely to attract an administrator’s attention when exploited. Of course, merely listening to the recipient’s connection without taking it over makes no difference if the connection is encrypted.
Still, an encrypted connection will be established with the falsifying machine if it has taken over the recipient’s address. On VMware, this was no realistic scenario, as the original recipient would have destroyed the connection with its next action in the network; on QEMU, it was. Either way, the original recipients of messages provide interesting opportunities the moment they go offline. Yet most people do not know who shares their subnet when they get a cloud-based virtual machine, or how its hypervisor protects against such attacks.
To avoid misunderstandings: all of these issues can be solved, even on virtual machines! ARP’s behavior is well known, and there are plenty of countermeasures for hypervisors, such as libvirt’s "nwfilter" rules, that solve the issue.
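For instance, libvirt ships predefined nwfilter rule sets; referencing the built-in "clean-traffic" filter (which bundles anti-MAC-, anti-IP-, and anti-ARP-spoofing rules) in a guest’s interface definition looks roughly like this, where the MAC address and network name are placeholders:

```xml
<interface type='network'>
  <source network='default'/>
  <mac address='52:54:00:12:34:56'/>
  <filterref filter='clean-traffic'/>
</interface>
```

With such a filter attached, the guest can no longer claim MAC or IP addresses other than its own.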
Yet this article focuses on intrusion through social and organizational vulnerabilities, not technical ones. Many professionals in the industry have never heard of ARP. Many virtualization-administration certifications do not cover the topic; usually, it is not necessary to handle ARP requests by hand. The problem is not the availability of solutions but the awareness of the professionals involved: they have to know about an issue to prevent it, and to avoid rolling out software that lets known issues become critical.
Concluding the Technical Part with the Allure of “gets” and Encoding Schemes that Prevent You from Losing Your Password
For decades, the C standard library contained a dangerous function, "gets", that facilitates buffer overflows: it reads input into a buffer without any length check (it was finally removed in C11). In some cases, the result is a segmentation fault and the application crashes. In the worst case, however, an attacker can inject arbitrary commands into the system. Solving this vulnerability is very easy: simply replace "gets" with "fgets", which takes the buffer size as a parameter and performs the length check that prevents the overflow. Yet even though it can be solved that easily, well-known applications still get hacked through issues like "gets" to this day, because professionals do not know about them or simply neglect the danger.
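The fix described above can be sketched as follows. The function name is mine, and fmemopen() is used only to simulate stdin with a fixed string so the example is self-contained:

```c
#define _POSIX_C_SOURCE 200809L
#include <stdio.h>
#include <string.h>

/* A bounded line read: fgets() stores at most bufsize - 1 bytes plus a
 * terminating NUL, so oversized input is truncated instead of overflowing
 * the buffer. gets() had no size parameter at all, which is why C11
 * removed it entirely. */
int read_line_bounded(const char *simulated_input, char *buf, size_t bufsize)
{
    FILE *in = fmemopen((void *)simulated_input, strlen(simulated_input), "r");
    if (in == NULL)
        return -1;
    if (fgets(buf, (int)bufsize, in) == NULL) {
        fclose(in);
        return -1;
    }
    buf[strcspn(buf, "\n")] = '\0';   /* strip the trailing newline, if any */
    fclose(in);
    return 0;
}
```

Whatever the caller feeds in, no more than bufsize bytes of the destination are ever touched.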
The article’s example code, a snippet combining “strcpy” and Base64, is very vulnerable but still not obsolete: each of its weaknesses keeps recurring. It allows buffer overflows, as “strcpy” has the same problem as “gets”: no length check.
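The strcpy weakness, and a bounded alternative, can be sketched like this (the function names are mine):

```c
#include <stdio.h>
#include <string.h>

/* The same missing length check, this time in strcpy(): it copies until it
 * meets a NUL byte, regardless of how small the destination is. A bounded
 * alternative such as snprintf() truncates to the destination size and
 * always NUL-terminates. */
void copy_unbounded(char *dst, const char *src)
{
    strcpy(dst, src);                   /* overflows dst if src is too long */
}

void copy_bounded(char *dst, size_t dstsize, const char *src)
{
    snprintf(dst, dstsize, "%s", src);  /* truncated, but never overflows */
}
```

Truncation may still be a functional bug worth reporting to the caller, but it is no longer a memory-corruption vulnerability.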
Further, the code treats the encoding scheme Base64 as encryption. But translating an English text into German is not encryption: even someone who does not speak German can put the text into Google Translate and learn its meaning. Encoding is no different, and Base64 is just as easy to use and to decode.
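Decoding Base64 requires no key, no secret, nothing beyond the public alphabet table, which is the whole point: it is an encoding, not encryption. A minimal decoder (canonical alphabet, '=' padding, no whitespace handling; function names are mine) fits in a few lines:

```c
#include <stddef.h>
#include <string.h>

/* Map one character of the public Base64 alphabet to its 6-bit value. */
static int b64_value(char c)
{
    if (c >= 'A' && c <= 'Z') return c - 'A';
    if (c >= 'a' && c <= 'z') return c - 'a' + 26;
    if (c >= '0' && c <= '9') return c - '0' + 52;
    if (c == '+') return 62;
    if (c == '/') return 63;
    return -1;                       /* '=' padding or invalid input */
}

/* Decode the NUL-terminated string in into out; return bytes written. */
size_t b64_decode(const char *in, unsigned char *out)
{
    size_t n = 0;
    unsigned int buf = 0;            /* rolling bit buffer */
    int bits = 0;
    for (; *in != '\0' && *in != '='; in++) {
        int v = b64_value(*in);
        if (v < 0)
            continue;
        buf = ((buf << 6) | (unsigned int)v) & 0xFFFFu;
        bits += 6;
        if (bits >= 8) {             /* a full byte is available */
            bits -= 8;
            out[n++] = (unsigned char)((buf >> bits) & 0xFFu);
        }
    }
    return n;
}
```

Anyone, attacker included, can run this over a “Base64-encrypted” credential store and read it in plain text.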
Many professionals still use Base64 as "encryption" without questioning an “encryption” process that never asks for any kind of secret or cryptographic key. A 2019 study at the University of Bonn, in which 43 freelance developers were assigned the task of developing password storage without knowing they were part of a study, revealed:
- 1 participant frankly supposed it was a study, but still used SHA-1 without iterations, a key derivation function, or even salting.
- 8 participants used Base64 for the "encryption" of passwords.
- 10 participants used MD5: only one of these implementations used salting, and none used a key derivation function or iterations.
Four of the Base64 developers and five of the MD5 developers had been explicitly prompted to develop a secure solution. The study analyzed a homogeneous sociotechnical environment with a limited number of participants, so it is of indicative character, but it matches widespread experience. Unfortunately, the problem is widespread enough that attackers know when to check for such special “algorithms”.
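To make the missing ingredients tangible, here is a sketch of the structure a proper scheme has: a per-user random salt, then many iterations of a one-way function. The function names are mine, and FNV-1a stands in for a real primitive purely so the code runs without a crypto library; production code must use a vetted KDF (PBKDF2, bcrypt, scrypt, Argon2), never anything like this:

```c
#include <stdint.h>
#include <stddef.h>
#include <string.h>

/* FNV-1a, a NON-cryptographic hash, used here only as a self-contained
 * stand-in to illustrate the salt + iteration structure. */
static uint64_t fnv1a(const unsigned char *data, size_t len, uint64_t h)
{
    for (size_t i = 0; i < len; i++) {
        h ^= data[i];
        h *= 1099511628211ULL;              /* 64-bit FNV prime */
    }
    return h;
}

uint64_t toy_derive(const char *password, const unsigned char salt[16],
                    unsigned iterations)
{
    uint64_t h = 14695981039346656037ULL;   /* FNV offset basis */
    h = fnv1a(salt, 16, h);                 /* salt defeats rainbow tables */
    h = fnv1a((const unsigned char *)password, strlen(password), h);
    for (unsigned i = 0; i < iterations; i++)   /* work factor vs. brute force */
        h = fnv1a((const unsigned char *)&h, sizeof h, h);
    return h;
}
```

The salt makes precomputed tables useless and the iteration count makes each guess expensive; those are exactly the two properties the Base64 and plain-MD5 solutions in the study lacked.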
The “Unknown Knowledge Attack”
In 2019, wolfSSL, a widespread SSL/TLS implementation, was discovered to be vulnerable to a buffer overflow (CVE-2019-11873) in its PSK extension, with the possibility of remote code execution. A length check would have prevented the issue.
Most vulnerabilities that can be exploited on running production systems are not technical attacks on machines. They are social or organizational attacks on the machines’ professionals: developers, administrators, engineers. Known attacks that could easily have been prevented are enabled by professionals who did not know about the dangers, or who expected anticipatory conduct (and therefore knowledge) from subsequent agents. The problems among professionals thus mirror the problems between professionals and users: wrongful reciprocal anticipations of each other.
The absence of “applicable knowledge” is the general origin. Software development, engineering, and administration offer an explicit response: products behave and work as intended, or they do not. If something does not work as intended, the necessary knowledge is acquired and applied, by gathering and analyzing the corresponding information, until the measurable issue is solved. IT security, or cybersecurity, offers no such explicit response: if a system has not been hacked yet, it seems (implicitly) secure. The information is available, but professionals are not aware of when they should gather and analyze it, so it never gets processed into applicable knowledge. Since such issues are social or organizational by nature, they cannot be solved by purely technical means.
VirtualBox was, by default, configured to resist the ARP-related attacks. This does not make VirtualBox the all-encompassing perfect paradigm: it is mainly intended for less complex use cases, and simply comparing it to VMware or QEMU would be unfair. Still, it would be worth analyzing which security functions could be enabled by default on other hypervisors, too: even the need to deactivate a function creates a basic form of awareness.
A more restrictive factory configuration of QEMU would be more secure. However, it would make complex setups more complicated wherever such functions needed to be adjusted or even deactivated (the latter is done too often without hesitation). A professional would then have to dig deeper into the topic to achieve the aim. If a higher level of expertise or more invested time is needed, expenses rise, and using QEMU becomes less competitive. Hence another question arises: how competitive is real security?
When Customers Demand IT Security (Promises), Or: The Users’ Share of the Problem
Vulnerabilities of RC4-secured environments, and theoretical indications of weaknesses in the algorithm itself, had long been known. Still, RC4 remained in widespread use, e.g., in online banking, until the aftermath of the Snowden documents. RC4’s practical security was known to depend on many external conditions that could combine into vulnerabilities (e.g., the weak initialization vectors in WEP). Secure and less fragile block ciphers had existed for years, and at least for HTTPS online banking, there was no noteworthy reason other than cost for companies not to migrate to a block cipher.
When discussing the topic, I have asked people about the encryption of their online banking. Most did not know anything about it, and some even imported unverified certificates when asked to. Most customers of financial institutions ask about features and prices, and that is what they know about. Investments in strong cryptography and sophisticated security audits are therefore not crucial for the business: rather, they increase the price and, in some cases, limit features the customer asked for.
Again, the problem is tied to measurement: performance, features, and prices can be measured explicitly. With average knowledge, customers measure IT security only implicitly: if there is a promise without refutation, the product is “secure”. Whether it is truly secure or simply never evaluated appropriately remains unknown to the customer. The customer thus has a significant share in distorting the competition. Emphasizing the importance of IT security makes no difference if the customer simultaneously pushes the competition the other way: if customers reward only the best price and features, competing companies will defect and unconditionally adjust to that demand (which then implies a security promise, not security), at the latest when survival is at stake. Being prepared for tomorrow’s potential hacks makes no difference if the company has ceased to exist by then. Further, secure pre-configurations of services can interfere with the user experience.
The Educational Share of the … Solution or Problem?
Many universities have started to offer IT-security-focused courses and degrees alongside those focusing on software development and other disciplines. While the former students have to know about buffer overflows and the actual purpose of Base64, the latter get an A+ for using “gets” and “strcpy” in C as long as the main task’s objective is fulfilled. The reason is simple: if a course is not IT-security-related, it often does not incorporate IT security objectives. From an educational perspective, students are often rewarded for neglecting IT security. At the same time, IT security specialists often lack knowledge of competitive, complex software development processes.
When I was about 14 or 15, I once lost the administrator password of a Fedora Core installation I used for testing. It was easy to find out that passwords are stored in a file called “shadow”. At that time, I had just started to engage deeply with such topics and did not know much about hashing. What I saw was therefore not my password but strange lines that seemed encoded in some way. I dug into the topic and learned about hashing and salting for secure password storage. So if I think of storing passwords today, my mind relates the task to hashes and salts before anything else. How I learned about it matters greatly.
On the other hand, a student who was assigned password storage as a task in a software development course will not build such a mental relation. If IT security is no objective of the course, the student will do it in the easiest, most (measurably) efficient way: perhaps plain text, or Base64, which is easy to implement and performs very well. Just like a student who finds out that “gets” performs better than “fgets” in C, since it spends no resources on length checks. If the tutor does not demonstrate the implications, students may never learn when and where to ask themselves security-related questions. Further, there is no reason to invest extra time in dedicated security research if there is no reward. Rewarding, even indirectly, can be the problem’s origin in degree studies and market competition alike.
If the tutor (or the customer) does not emphasize the solution’s security, Base64 is easy and cheap to implement and performs well. The student will probably be unable to recall that another tutor mentioned the security implications in a different lecture: because of the differing lecture and task objectives, the student perceives different contexts and does not intuitively relate them. Later, working as a software developer, the former student links such tasks only to software development lectures without a security focus. A teacher might argue that it was for teaching purposes only; however, IT security has to be an integral part of software development and of IT as such. Referring back to my shadow’s hashes: the human mind is an event-driven relational database, and it should be “updated” accordingly. If IT security and software development are experienced as two separate events, the former will not become part of the latter. Hence: “code as you train, train as you code”.
The freelancer study’s introduction summarizes its preceding study, which focused on students (including MSc students) as participants. 20 students were not explicitly asked to implement the registration process of a university social network securely, and therefore did not try. Another 20 students were asked to implement it securely, of whom only 12 did so to some degree. Moreover, every solution rated secure was verifiably copy-pasted: no student coded a secure solution from scratch.
It may be important to ask about the objectives of the participating students’ preceding software development tasks. When assigned the new task, they experienced an event that contained problems: to which preceding events did they relate it? Which problems and questions did they derive from those predecessors? The different behavior of the two groups indicates that the availability of (some level of) security knowledge does not imply “applicable knowledge”: students applied only the knowledge from those preceding events that correlated with the new one.
Getting Rid of IT Security … to Increase IT Security
IT security degrees do not solve the problem of developers without security awareness. Likewise, security-focused black-box and white-box testing are additions, not replacements: both are less likely to find vulnerabilities than security-aware software development is to avoid them in advance. The average automated test, even a security-focused one, might not recognize Base64 “pseudo-cryptographic encodings” or MD5 hashes if the algorithms themselves are properly implemented and integrated: Base64 has plenty of legitimate (non-cryptographic) use cases that could explain its presence. Such security issues are more likely to be found by human penetration testers.
However, non-automated human work is costly (indirectly increasing product prices) and unreliable: it is not certain that testers will look into the very folder where the MD5 hashes or Base64 encodings are stored, let alone identify them as such. They need a reason to analyze that folder’s content. This depends strongly on their given scope and experience and, given the complexity of contemporary systems, also on a bit of luck. Independent penetration testing and red-teaming are valuable additions that increase security and stability; in particular, they are likely to identify errors in reasoning, often embedded in technical or organizational architectures and therefore invisible to the people within. Still, they do not solve any security problem on their own.
Some certifications, like the Linux Foundation Certified System Administrator and Certified Engineer, already incorporate IT security without making it a separate certification path. This needs to be normalized throughout IT training for every type of professional. Even where preinstalled, “disabled” remains a common state of SELinux on many systems, because administrators are not familiar with managing it.
In larger companies with dedicated, professionalized (back-office) IT, integrated security has often been normalized at a high level, and recent graduates are guided to best practices by their professionalized environment. SMEs, however, cannot dedicate and professionalize such resources. They can be overstrained by the implicit evaluation of security, or of the security qualifications of service providers. For them, the all-encompassing choice is a computer science graduate from a university. Graduates therefore have to integrate and normalize IT security in their education instead of separating it, avoiding at least critical “issues by design” in whatever environment they specialize in.
IT-related SMEs are no exception. A hypothetical (graduated or certified) PHP-developing CTO may never have heard of Argon2 or SQL injections when setting up a CMS with a user portal on the Internet, unaware of his inability to do it himself and simultaneously unable to identify professional solution providers on the market. (Out)sourcing a CMS service is a project that needs qualified management: basic knowledge of critical issues and dependencies, to identify professional (internal or external) providers and to account for the project’s outcome in future projects. Professional services include security by definition. Dedicated IT security is a symptom of the problem, not its solution; but if professionals know their security context, the division of labor can be part of the solution.
If average users are seen as professionals (for example, professionals in using office suites), these concepts transfer widely: the average spreadsheet or word-processor certification does not include even a ten-minute lesson on doc versus docx and on “macro + unknown origin = dangerous”. Complementarily, how many secondary schools that introduced IT lessons integrate such topics when teaching spreadsheets? Getting rid of separated IT security equals getting rid of users separated from professionals.
Holistic Approaches for Holistic Solutions
Some years ago, Bruce Schneier noted that it is harder for the NSA to backdoor TLS than BitLocker. Even though his blog post is not directly related, its idea can be extended: it is often harder to backdoor software than to exploit the gaps in a professional’s applicable knowledge.
We still uncover unforeseen, never-intended behavior of systems. Some of it results in crashes of applications or even operating systems; other behavior results in security issues that, in the most critical cases, offer attackers administrator privileges. This will not change. But many IT security issues are not of this kind. They are not about hacking a system that behaves unintendedly; they are about hacking people who behave unintendedly: developers, engineers, administrators (and office-suite users/professionals). In the end, “professional” and “user” are situational designations, divided only by having the needed “applicable knowledge” or not. If a penetration tester finds a type of vulnerability that was known before, the question is not how to solve the issue but why it was there. The clearest evidence that IT security is far from normalized is the term itself: calling it “IT security” proves that it is treated as a field separate from development and administration.
IT security needs holistic, all-encompassing approaches. It contains social and technical aspects whose implications reach far back before the development of current solutions. We need to emphasize the people, not the solution. People’s behavior is always the result of related, experienced past events. This includes professionals and users, tutors and students: their massive interaction through and with technology consolidates a holistic agency, and this agency needs to be considered (and manipulated) holistically, too. IT security is a potential outcome of this agency, and people have to achieve it: like success, it is nothing they can buy off the shelf.
Opinions expressed by DZone contributors are their own.