Packers, How They Work, Featuring UPX

A security researcher demonstrates how to get packers, specifically UPX, up and running on our system. Read on to learn more!

By Christopher Lamb · Sep. 30, 17 · Tutorial

This article is featured in the new DZone Guide to Proactive Security. Get your free copy for more insightful articles, industry statistics, and more!

UPX, the Ultimate Packer for Executables, is a software packer. A really good one, in fact, and it’s currently under active development to boot. Its source code is available, it runs on a variety of operating systems, and it compresses executables for a variety of operating systems too.

As the name suggests, UPX packs executables. And by “pack,” I mean it compresses and compartmentalizes programs. It will take an executable, compress it, and pack the compressed code into another section of the executable. At runtime, it will uncompress the previously compressed code and execute it.

Along the way, it happens to obfuscate the intent and actual code of the program.
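To make the idea concrete, here is a toy sketch of the same pattern in Python. It is not how UPX works internally (UPX operates on native executables and uses its own stub and compression formats); the names pack_source and STUB_TEMPLATE are made up for this illustration. It only shows the shape of the trick: a small stub carries a compressed payload, restores it in memory at runtime, and runs it.

```python
import base64
import zlib

# Toy "unpacking stub": restores the original source in memory and runs it.
# This template is an illustration only, not UPX's real stub.
STUB_TEMPLATE = '''import base64, zlib
_payload = base64.b64decode({payload!r})
exec(compile(zlib.decompress(_payload), "<unpacked>", "exec"))
'''


def pack_source(source: str) -> str:
    """Return a new script that carries `source` compressed inside a stub."""
    compressed = zlib.compress(source.encode("utf-8"), 9)
    return STUB_TEMPLATE.format(payload=base64.b64encode(compressed))


original = 'GREETING = "hello from the original program"\nprint(GREETING)\n'
packed = pack_source(original)

# The readable string is no longer visible in the packed script...
print("hello from the original" in packed)

# ...but running the packed script restores and executes the original code.
namespace = {}
exec(packed, namespace)
```

Note that the packed script on its own reveals nothing readable about what the original does, which is the obfuscation side effect described above.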

Don’t get me wrong here—UPX is a remarkable feat of engineering, and I don’t begrudge the use of the program. It has real applications in areas where storage space or bandwidth is at a premium. And honestly, it’s not hard to detect or to work around when used, and most anti-malware solutions don’t have a problem detecting it today. But the idea behind packers in general, and UPX specifically, is very useful to malware authors.

To show how these tools are used to hide malware payloads, I’m going to show you how packers work, using UPX, and how you can detect them. Now I’m using MacOS, but you can follow along with Linux if you’d like. Most of the points I make will carry over. UPX is available on Windows too.

Installing UPX. UPX is a breeze to install. You can compile from source if you’d like, but I didn’t; I used Homebrew. You could use MacPorts too, or if you’re on Linux, it’s likely available via the package manager of your choice. To install via Homebrew, typing brew install upx at the command line should work just fine.

Creating a test program. Now that you have UPX installed, we’re going to create two different versions of /bin/ls, one packed and one not, and start to compare the two. I suggest you create a directory to work in. I called mine upx-test, but you go with whatever name works for you. Copy /bin/ls into the directory.

Now, let’s create a packed version: upx ls -o ls-upx should do the trick. So that we keep both versions around, also copy ls to ls-orig. Just for fun, compare the output from running ./ls-orig to ./ls-upx. Notice they look exactly the same. Next, compare the hashes of ./ls, ./ls-orig, and ./ls-upx. You’ll notice that ./ls and ./ls-orig have the same hash, while ./ls-upx has a different one, and that the hash of ./ls-upx stays the same even after you execute it. This is interesting in that it implies that the UPX-packed executable extracts the packed code in memory, runs it, and terminates its process without writing anything back to disk. The program representation on disk never changes.
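If you would rather script the hash comparison than run a hashing tool by hand, a minimal sketch along these lines should do. It assumes the three files sit in the current directory under the names used above; the choice of SHA-256 and the helper names sha256_of and report are mine, not something the workflow requires.

```python
import hashlib
from pathlib import Path


def sha256_of(path):
    """Return the hex SHA-256 digest of a file, read in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(65536), b""):
            digest.update(chunk)
    return digest.hexdigest()


def report(names=("ls", "ls-orig", "ls-upx"), directory="."):
    """Map each listed file that exists in `directory` to its digest."""
    found = {}
    for name in names:
        path = Path(directory) / name
        if path.is_file():
            found[name] = sha256_of(path)
    return found


if __name__ == "__main__":
    for name, digest in report().items():
        print(f"{digest}  {name}")
```

Run it once before and once after executing ./ls-upx; the digest for ls-upx should be identical both times, and different from the digest shared by ls and ls-orig.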

Looking Through the Binaries

See upx.github.io for more information, source code, and documentation.

Now, let’s start to look through the binaries. We’ll use Jonathan Levin’s fabulous JTool for this. Sorry, there’s not an equivalent on Linux, but later I’ll use Hopper for some reverse engineering. You can see similar things via the Hopper interface, and Hopper runs on both Linux and MacOS.

Anyway, if you’re on a Mac, install JTool (brew install jtool should work, and I believe MacPorts has it too). If you’re on Linux, you’ll need to use GNU binutils for this.

Anyhow, let’s take a look at the program formats, shall we?

Figure 1: Segments and sections of the original ls command

In figure 1, we see the output from jtool -l over the original ls executable. This has a large amount of information, including the program entry point and OS versions. The text segment is fairly large as well and contains constant values, strings, and other information. Now, let’s take a look at the version we just packed:

Figure 2: Segments and sections from a UPX-packed ls command

Well, it looks like we have a little less information here, and most of the program consists of the text segment too. So what are these sections?

Well, LC_SEGMENT_64 is a standard 64-bit segment to be loaded into an address space (this is MacOS specific). LC_VERSION_MIN_MACOSX lets us know the minimum OS version for this executable, and LC_UNIXTHREAD gives us information about how the program is to be started, including the program entry point. __TEXT usually contains program code, while __LINKEDIT contains information used by the linker: things like strings, function names, and so on. __PAGEZERO is a chunk of memory allocated and protected to throw exceptions on access. This helps protect the program against null pointer or integer dereference errors in code.
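If you don’t have JTool handy, the load commands can be listed with a few lines of Python. The sketch below is a rough stand-in, not a replacement for jtool -l: it assumes a thin, little-endian, 64-bit Mach-O file (a universal binary would have to be thinned first), it only names the three load commands discussed here, and the function name list_load_commands is my own.

```python
import struct

MH_MAGIC_64 = 0xFEEDFACF  # magic number of a 64-bit Mach-O header
LC_SEGMENT_64 = 0x19
LOAD_COMMAND_NAMES = {
    0x05: "LC_UNIXTHREAD",
    0x19: "LC_SEGMENT_64",
    0x24: "LC_VERSION_MIN_MACOSX",
}


def list_load_commands(data: bytes):
    """Return (command name, segment name or None) for each load command."""
    # mach_header_64: magic, cputype, cpusubtype, filetype, ncmds,
    # sizeofcmds, flags, reserved (32 bytes in total).
    fields = struct.unpack_from("<IiiIIIII", data, 0)
    magic, ncmds = fields[0], fields[4]
    if magic != MH_MAGIC_64:
        raise ValueError("not a thin little-endian 64-bit Mach-O image")

    offset = 32
    commands = []
    for _ in range(ncmds):
        # Every load command starts with its type and its total size.
        cmd, cmdsize = struct.unpack_from("<II", data, offset)
        if cmdsize < 8:
            raise ValueError("corrupt load command size")
        segment = None
        if cmd == LC_SEGMENT_64:
            # In segment_command_64 the 16-byte segment name follows
            # directly after cmd and cmdsize.
            raw = data[offset + 8:offset + 24]
            segment = raw.split(b"\x00", 1)[0].decode("ascii")
        commands.append((LOAD_COMMAND_NAMES.get(cmd, hex(cmd)), segment))
        offset += cmdsize
    return commands
```

Calling list_load_commands(open("ls-upx", "rb").read()) on a file that meets those assumptions should return one entry per load command, with the segment name filled in for each LC_SEGMENT_64.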

It seems that UPX pretty radically changes the program. Let’s look a little deeper.

Reverse Engineering

Now, we’re going to start looking into the program. We can do this from the command line using tools like JTool or using high-end disassemblers like IDA. In this case, we’ll use a reverse engineering tool called Hopper. Hopper is a fully-featured reverse engineering tool, and it offers a free trial to boot. If you’re following along, go ahead and download it. I’ll wait.

Ok, great—let’s get started.

Open ./ls-upx and ./ls-orig in Hopper, and take a look at the left pane. Select the Proc. tab. The first thing you see is that there are very few functions defined in  ./ls-upx when compared to ./ls-orig.

You can download Hopper from hopperapp.com. It’s something like an order of magnitude less expensive than IDA Pro, so if it fits your needs, I highly recommend it. It doesn’t support all the processor types IDA does, nor does it run on Windows, but it works very well for MacOS reverse engineering.

Figure 3: UPX makes executables not-very-functional!

Not only are there very few functions defined, they are all local. None of these are imported from external libraries (go to Navigate > Imported Symbols to see what I mean). And if you click on the str tab, you’ll see that the UPX version has no strings defined in the program, while the non-UPX version has plenty.
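The missing strings are a direct consequence of compression, and you can see the effect without Hopper. The sketch below is a hedged illustration: printable_strings is a rough imitation of what a strings-style tool reports, the sample string table is made up, and zlib merely stands in for whatever compression the packer applies.

```python
import re
import zlib

# Runs of six or more printable ASCII bytes, similar in spirit to what a
# strings-style tool reports.
PRINTABLE_RUN = re.compile(rb"[\x20-\x7e]{6,}")


def printable_strings(data: bytes):
    """Return every run of printable ASCII found in `data`."""
    return [m.group().decode("ascii") for m in PRINTABLE_RUN.finditer(data)]


# Made-up stand-in for the string table of an unpacked executable.
table = [b"usage: ls [-ABCFG] [file ...]", b"illegal option", b"COLUMNS"]
plain = (b"\x00".join(table) + b"\x00") * 20

# Stand-in for the packed file: the same bytes after compression.
packed_like = zlib.compress(plain, 9)

print(len(printable_strings(plain)), "strings before compression")
print(len(printable_strings(packed_like)), "strings after compression")
```

To try it on the real files, pass the bytes of ./ls-orig and ./ls-upx to printable_strings and compare the counts.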

Now, select the pseudo-code button at the top of the display area:

Figure 4: Pseudo-code mode activated!

If you do this for both ./ls-orig and ./ls-upx, you’ll see something interesting. First, the original ls command has a large initial entry point, while the UPX’d version doesn’t. The original calls into external libraries and the operating system almost immediately (e.g. getenv(.), ioctl(.), and so on). The UPX’d version is self-contained (we’d expect this, as it doesn’t import anything).

Figure 5: Decompiled entry point from UPX’d ls

Interesting things here! Look at the do/while loop in the center of the function; here, we’re traversing the program __TEXT segment from the top down, until we get to a non-null value. Then, we have two additional function calls, and we exit (notice that once we get to the call to sub_f0000f7e(.) we have no more jumps—although, honestly, the called functions could cheat and jump, but we’ll take a look at that shortly).

Of these functions, sub_f00008fd(.) is the most interesting. If we look into that function (which does not decompile well, to be honest), we can see that this is the primary function dispatcher for the UPX code. If you look at the assembly block structure of the function (press the button just to the left of the pseudo-code button in Hopper), you can see that it’s much more complex than the EntryPoint(.) function, and it delegates to many more functions as well.

You’ll also notice that it returns a function pointer which the program then calls (e.g. rax = (r15)();). Guess what this is? This is the original program entry point after the code has been decompressed and is ready for execution.

UPX does lots of interesting things that we’re not covering, like loading all of the dependent libraries (important on MacOS, as static linking isn’t supported these days). But this gives you an idea of how these packers work. UPX doesn’t encrypt the payload either, which is certainly something you could do, though it shifts a bit into malware-specific functionality. But at the end of the day, you can pretty clearly tell if a file’s been packed. It has no (or very few) strings. It has no (or very few) library dependencies. And the number of functions defined in the executable is very small.

With UPX, there are other clues too—if you look through the binary, you’ll find sequences of ASCII codes that spell “UPX!” used as tokens. You can see one of them just prior to where the booting function stopped scrolling through the __TEXT segment, for example.
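Searching for that token is easy to script. Here is a minimal sketch; the function name find_marker is my own, and it simply reports every byte offset at which the ASCII sequence appears.

```python
def find_marker(data: bytes, marker: bytes = b"UPX!"):
    """Return every byte offset at which `marker` occurs in `data`."""
    offsets = []
    start = data.find(marker)
    while start != -1:
        offsets.append(start)
        start = data.find(marker, start + 1)
    return offsets


sample = b"\x00\x00UPX!abcUPX!"
print(find_marker(sample))  # prints [2, 9]
```

To check the test binaries, read each file in binary mode and pass its contents in, for example find_marker(open("ls-upx", "rb").read()).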

Packers aren’t magic, but they are well-designed pieces of software engineering. Now you know how they work, and who knows, maybe you’ll use UPX someday too.
