Method Internals in Swift 5.0
In this post, we continue to get a better understanding of the internal workings of Swift by examining how methods work in this language.
One of the nice things about Swift 5 is the final stabilization of the ABI. This is actually a big deal. The application binary interface defines exactly how data is stored in programs, how it is shared from libraries, and things like that. It includes name decoration, class and object definitions, and so on. Now that we have a stable ABI, building tools that analyze and manipulate these binary representations will become much, well, not easier, but not as much a waste of time. Until now, you were just about guaranteed to have any tools you created broken by new Swift versions. With a stable ABI, this shouldn't happen.
We just covered how classes are defined in Swift 5, and we discovered that they reflect the basic design in Objective-C. There are some key differences though, and one of those is member method definitions.
In Objective-C, you might remember that methods are stored in the class definition, behind a data pointer. The structure this data pointer leads to contains another pointer that references a list of method structures. The method structures contain a name, a pointer to an implementation, and a few other things. Let's see what Swift does.
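As a reminder of what that Objective-C arrangement implies, here is a minimal sketch in Swift. The type and function names (MethodEntry, ClassData, ClassRecord, findImplementation) are invented for illustration and are not the runtime's real types. The model keeps only the name and the implementation and leaves out the "few other things".

```swift
// Simplified model of the Objective-C layout described above.
// These are illustrative types, not the runtime's actual structures.
struct MethodEntry {
    let name: String                  // selector name
    let implementation: () -> String  // stands in for the implementation pointer
}

struct ClassData {
    let methods: [MethodEntry]        // the method list reached through the data pointer
}

struct ClassRecord {
    let name: String
    let data: ClassData
}

// Find a method by name, which is conceptually what a selector-based lookup does.
func findImplementation(named selector: String, in cls: ClassRecord) -> (() -> String)? {
    return cls.data.methods.first(where: { $0.name == selector })?.implementation
}

let objcStylePrinter = ClassRecord(
    name: "Printer",
    data: ClassData(methods: [
        MethodEntry(name: "printMsg", implementation: { "printMsg called" })
    ])
)
```

The point of the model is that the method name survives into the binary as data and is used at run time to find the code.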
First, we know Swift does use the objc_class structure, and in this case it looks like this:
_$s9swift_cmd7PrinterCN:
struct __objc_class {
    _$s9swift_cmd7PrinterCMm, // metaclass
    _OBJC_CLASS_$__TtCs12_SwiftObject, // superclass
    __objc_empty_cache, // cache
    0x0, // vtable
    __objc_class__TtC9swift_cmd7Printer_data+1 // data
}
There's a slight difference here: the final pointer, the object data pointer, actually points to an offset from the beginning of the __objc_class__TtC9swift_cmd7Printer_data structure. If we take a look at that address, we find this:
__objc_class__TtC9swift_cmd7Printer_data:
struct __objc_data {
    0x80, // flags
    16, // instance start
    32, // instance size
    0x0,
    0x0, // ivar layout
    aTtc9swiftcmd7p, // name
    0x0, // base methods
    0x0, // base protocols
    __objc_class__TtC9swift_cmd7Printer_ivars, // ivars
    0x0, // weak ivar layout
    0x0 // base properties
}
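One plausible reading of the +1 on the data pointer is that it is a flag, not a real offset. This assumes Swift follows the Objective-C runtime's convention of keeping flag bits in the low bits of an aligned data pointer. The sketch below shows how such a field could be split. The constant names, the helper splitDataField, and the address are all hypothetical.

```swift
// Assumed constants, modeled on the convention of storing flags in the
// low bits of an aligned pointer. The names are made up for this sketch.
let swiftClassFlag: UInt64 = 0x1
let flagBitsMask: UInt64 = 0x7

// Split a raw data field into the structure's address and the Swift flag.
func splitDataField(_ raw: UInt64) -> (address: UInt64, isSwiftClass: Bool) {
    let address = raw & ~flagBitsMask
    let isSwiftClass = (raw & swiftClassFlag) != 0
    return (address, isSwiftClass)
}

// Hypothetical address standing in for __objc_class__TtC9swift_cmd7Printer_data.
let dataStructAddress: UInt64 = 0x1_0000_2000
let field = splitDataField(dataStructAddress + 1)
// field.address is the original structure address; field.isSwiftClass is true.
```

Under that assumption, masking off the low bits recovers the address of the structure shown above.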
Okay, so far so good, right? Very similar to what we've seen in Objective-C, even with the offset. But look a little closer: the base methods field is 0x0, so there's no corresponding list of methods. Uh oh.
Well, we know that methods are associated with class instantiations somehow. But how? Well, let's take a look at the procedures defined in the executable.
Hopper gives us a list of 29 procedures. The two we're interested in are:
_$s9swift_cmd7PrinterC8printMsgyyF
_$s9swift_cmd7PrinterC11printString7messageySS_tF
How do I know? Well, even though the swift-demangle utility has yet to catch up with the name mangling in Swift 5, if you look closely you can see the names of the methods we're interested in embedded in these procedure names. Let's take a look at _$s9swift_cmd7PrinterC11printString7messageySS_tF. It's the method we call, and if you take a look at the Swift implementation you can see it calls printMsg():
func printString(message: String) {
    str_to_print = message
    printMsg()
}
Perhaps it'll give us a clue as to how methods are defined on classes.
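The method names are readable in those symbols because the mangling stores each identifier as a decimal length followed by that many characters (9swift_cmd, 7Printer, 8printMsg). A naive sketch of pulling them out might look like this. The function name lengthPrefixedIdentifiers is made up for illustration. It deliberately ignores the rest of the mangling grammar, such as substitutions and type codes like ySS_tF, so treat it as a reading aid and not a demangler.

```swift
// Naive sketch: collect length-prefixed identifiers from a mangled symbol.
// It does not handle substitutions or any other part of the real grammar.
func lengthPrefixedIdentifiers(in symbol: String) -> [String] {
    let chars = Array(symbol)
    var identifiers: [String] = []
    var i = 0
    while i < chars.count {
        // Read a run of ASCII digits as a decimal length, if one starts here.
        var length = 0
        var sawDigit = false
        while i < chars.count, let digit = chars[i].wholeNumberValue, chars[i].isASCII {
            length = length * 10 + digit
            sawDigit = true
            i += 1
        }
        if sawDigit, length > 0, i + length <= chars.count {
            // The next `length` characters form one identifier.
            identifiers.append(String(chars[i..<(i + length)]))
            i += length
        } else if !sawDigit {
            // Not a digit: skip markers such as "$s", "C", or "yyF".
            i += 1
        }
    }
    return identifiers
}

let parts = lengthPrefixedIdentifiers(in: "_$s9swift_cmd7PrinterC11printString7messageySS_tF")
// parts == ["swift_cmd", "Printer", "printString", "message"]
```

Read this way, the symbol names the module (swift_cmd), the class (Printer), the method (printString), and its argument label (message).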
Here's the disassembly:
_$s9swift_cmd7PrinterC11printString7messageySS_tF:
push rbp
mov rbp, rsp
push r13
sub rsp, 0x48
xorps xmm0, xmm0
movaps xmmword [rbp+var_20], xmm0
mov qword [rbp+var_28], 0x0
mov qword [rbp+var_20], rdi
mov qword [rbp+var_18], rsi
mov qword [rbp+var_28], r13
mov qword [rbp+var_30], rdi
mov rdi, rsi
mov qword [rbp+var_38], rsi
mov qword [rbp+var_40], r13
call imp___stubs__swift_bridgeObjectRetain ; swift_bridgeObjectRetain
mov rsi, qword [rbp+var_40]
mov rdi, qword [rsi]
mov rdi, qword [rdi+0x60]
mov r13, qword [rbp+var_30]
mov qword [rbp+var_48], rdi
mov rdi, r13
mov rsi, qword [rbp+var_38]
mov r13, qword [rbp+var_40]
mov rcx, qword [rbp+var_48]
mov qword [rbp+var_50], rax
call rcx
mov rax, qword [rbp+var_40]
mov rcx, qword [rax]
mov rcx, qword [rcx+0x78]
mov r13, rax
call rcx
add rsp, 0x48
pop r13
pop rbp
ret
We have three CALL instructions. The first calls out to an external function, swift_bridgeObjectRetain(). That's not really what we're interested in, so let's take a look at the second:
mov rcx, qword [rbp+var_48]
mov qword [rbp+var_50], rax
call rcx
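The second CALL goes through the RCX register, which is loaded with an address from the stack. Yes, from the stack. Tracing back through the listing, that stack slot (var_48) was filled a few instructions earlier with a value read from offset 0x60 past the pointer that R13 pointed to on entry, so the call target is looked up at run time and is not named in the instruction. This is a MAJOR change from Objective-C, and we'll need to start up LLDB to look at this a bit more closely.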
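One way to picture the mov rdi, qword [rdi+0x60] and call rcx sequence in the listing is as a lookup in a per-class table of function pointers, indexed by a fixed slot. The toy model below is a sketch under that assumption only. ToyMetadata, ToyObject, Slot, and callSlot are invented names, and the slot numbers do not correspond to the real offsets 0x60 or 0x78.

```swift
// Toy model of table-based dispatch; not Swift's actual metadata layout.
final class ToyMetadata {
    // Function "pointers" stored at fixed slots, like entries at fixed offsets.
    let slots: [(ToyObject) -> String]
    init(slots: [(ToyObject) -> String]) { self.slots = slots }
}

final class ToyObject {
    let metadata: ToyMetadata  // plays the role of the pointer at the start of the object
    var text = ""
    init(metadata: ToyMetadata) { self.metadata = metadata }
}

enum Slot {
    static let printMsg = 0    // illustrative slot index
}

// Fetch the target from the object's table, then call it with the object itself.
func callSlot(_ index: Int, on object: ToyObject) -> String {
    let target = object.metadata.slots[index]  // like loading the target from metadata + offset
    return target(object)                      // like the indirect call through a register
}

let toyMetadata = ToyMetadata(slots: [
    { object in "printing: \(object.text)" }
])
let toyPrinter = ToyObject(metadata: toyMetadata)
toyPrinter.text = "hello"
let output = callSlot(Slot.printMsg, on: toyPrinter)
// output == "printing: hello"
```

Unlike a lookup by name, no method name is needed at the call site, only a position in the table.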