Thu, 3 October 2013
In Defense of Boxing/Unboxing

1. Boxed values take up more memory
A boxed value resides in the heap. That means we need a pointer (32b or 64b) from the stack to our reference type in the heap, as well as a sync block index (32b). Add those to the 32b of the value itself and a boxed int32 now takes up either 96b or 128b. 3-4 times the space! Ouch!
2. Boxed values require an additional read
Values on the stack are right there. Stick and move, stick and move! To fetch a boxed value you must first get the pointer, then look up the object. This means boxed values are slower, in addition to being larger.
3. Short-lived values clog the heap
When an item is popped off the stack, it's gone. Gone Daddy Gone. In contrast, unused boxed values pile up in the heap until the garbage collector decides to do something about it.
[Image: It adds up...]
4. Boxing and unboxing operations take time/CPU
Boxing requires allocating space in the heap and copying the value from the stack. Unboxing is cheaper, since you only need the address of the fields inside the boxed instance and can skip the allocation, but you usually end up copying the value data from the heap back to the stack if you want to use it. According to MSDN: "[Boxing] can take up to 20 times longer than a simple reference assignment. When unboxing, the casting process can take four times as long as an assignment."
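The copy-in, copy-out behavior described above can be seen in a few lines. This is only a minimal sketch: the class and method names (BoxingCopyDemo, BoxedValueAfterMutation) are made up for illustration and are not from the episode.

```csharp
using System;

public static class BoxingCopyDemo
{
    // Returns what the box holds after the original local has been changed.
    public static int BoxedValueAfterMutation()
    {
        int original = 42;
        object boxed = original;   // box: allocate on the heap and copy 42 into the new object
        original = 7;              // changes only the local; the box keeps its own copy
        int unboxed = (int)boxed;  // unbox: type check, then copy the value back into a local
        return unboxed;
    }

    public static void Main()
    {
        Console.WriteLine(BoxedValueAfterMutation()); // prints 42
    }
}
```

Because the box is a separate copy, changing the local afterwards never shows up in the boxed object. That allocation plus copy is the work you pay for on every box.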
5. Casting
Casting isn't free, but it's generally considered to be in the "Don't worry about it" category of performance hits. Use a profiler, people! The real problem with casting is that you get no compile-time type safety checks. Check ahead or be smote by InvalidCastExceptions.
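Here is a rough sketch of what "check ahead" can look like. Unboxing has to name the exact value type that was boxed, so a boxed int cannot be unboxed straight to a long. The names UnboxCheckDemo, DescribeUnbox, and UnboxOrDefault are hypothetical helpers invented for this example.

```csharp
using System;

public static class UnboxCheckDemo
{
    // Tries to unbox straight to long; a boxed int fails this at run time, not at compile time.
    public static string DescribeUnbox(object boxed)
    {
        try
        {
            long value = (long)boxed;
            return "unboxed " + value;
        }
        catch (InvalidCastException)
        {
            return "InvalidCastException";
        }
    }

    // "Check ahead": test the runtime type before unboxing.
    public static int UnboxOrDefault(object boxed, int fallback)
    {
        return (boxed is int) ? (int)boxed : fallback;
    }

    public static void Main()
    {
        object boxed = 42;                                   // a boxed int
        Console.WriteLine(DescribeUnbox(boxed));             // prints InvalidCastException
        Console.WriteLine(UnboxOrDefault(boxed, -1));        // prints 42
        Console.WriteLine(UnboxOrDefault("forty-two", -1));  // prints -1
    }
}
```

The compiler accepts all three calls without complaint. The mistake only surfaces when the code runs.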
6. Implicit Boxing
Okay, so boxing/unboxing is big, slow, and ugly...but it's also sneaky! Consider the following code:

    var collection = new ArrayList();
    for (var i = 99; i > 0; i--)
    {
        collection.Add(i);
    }

It looks innocuous enough, but that "Add" function ends up performing 99 box operations. Here's the relevant IL:

    IL_0000: newobj instance void [mscorlib]System.Collections.ArrayList::.ctor()
    IL_0005: stloc.0
    IL_0006: ldc.i4.s 99
    IL_0008: stloc.1
    IL_0009: br.s IL_001c
    IL_000b: ldloc.0
    IL_000c: ldloc.1
    IL_000d: box [mscorlib]System.Int32
    IL_0012: callvirt instance int32 [mscorlib]System.Collections.ArrayList::Add(object)
    IL_0017: pop
    IL_0018: ldloc.1
    IL_0019: ldc.i4.1
    IL_001a: sub
    IL_001b: stloc.1
    IL_001c: ldloc.1
    IL_001d: ldc.i4.0
    IL_001e: bgt.s IL_000b

This is one of the reasons why using an ArrayList will get your wrist slapped in a code review.
7. They're (almost) unnecessary!
Most discussions on boxing/unboxing in .NET focus on old skool data structures like ArrayList and Hashtable. These objects were the de facto (and du jour!) collections before .NET 2 came along and saved us all with generic collections like List<T> and Dictionary<TKey, TValue>. And it was good! Straight from the horse's MSDN: "Generics allow you to define type-safe classes without compromising type safety, performance, or productivity." We get all the benefits of the ArrayList and Hashtable collections without having to box or unbox.
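For comparison, here is a sketch of the same 99-item loop written against both collection types. GenericCollectionDemo and its two methods are illustrative names only. The point is that List<int>.Add takes an int, so nothing gets boxed on the way in and nothing needs casting on the way out.

```csharp
using System;
using System.Collections;
using System.Collections.Generic;

public static class GenericCollectionDemo
{
    public static int SumWithArrayList()
    {
        var collection = new ArrayList();
        for (var i = 99; i > 0; i--)
        {
            collection.Add(i);      // Add(object): every int is boxed
        }
        var sum = 0;
        foreach (object item in collection)
        {
            sum += (int)item;       // cast and unbox on the way out
        }
        return sum;
    }

    public static int SumWithList()
    {
        var collection = new List<int>();
        for (var i = 99; i > 0; i--)
        {
            collection.Add(i);      // Add(int): stored directly, no box
        }
        var sum = 0;
        foreach (int item in collection)
        {
            sum += item;            // no cast, no unbox
        }
        return sum;
    }

    public static void Main()
    {
        Console.WriteLine(SumWithArrayList()); // prints 4950
        Console.WriteLine(SumWithList());      // prints 4950
    }
}
```

Both methods return the same total. The generic version also rejects `collection.Add("oops")` at compile time, which the ArrayList version happily accepts.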
But about that almost...

In Defense of Boxing and Unboxing
7 deadly sins aside, there are some good reasons that boxing/unboxing are still around. This is what we came up with, with a little hint-tweet from @jonskeet.

1. Legacy Code
Pre-.NET 2.0, you're stuck with ArrayList and Hashtable, unless you want to roll/download something custom. Box away.

2. 3rd Party Library
If their function takes an object, your value type gets boxed and passed as a reference type. No use sulking about it.

3. .NET internals
.NET notably makes use of boxing and unboxing with the dynamic keyword, and reflection would be pretty tough without boxing. :) That said, these things are typically used for productivity over performance.

4. Mixed value/reference type collections
The Console.WriteLine overload that takes a string and a params Object array is a great example. The params let you pass an arbitrary number of arguments that are used to populate values in your string. These params could be of value or reference type, so you're stuck with their common ancestor, System.Object, which ends up boxing the value types. However, I have yet to see an application whose biggest performance bottleneck is writing to stdout.

Conclusion
M$FT did a great job designing C#, and they've done an even better job maintaining it. Between the generic collections added in .NET 2 and the ToString() trick, I have a hard time getting behind this big, slow, and ugly contender, but it still has its places.
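The post does not spell out the ToString() trick, so the sketch below is one reading of it: call ToString() on the value type yourself, so the formatting method receives a string reference instead of a boxed int. It uses string.Format rather than Console.WriteLine so the result can be compared. FormatArgsDemo and its method names are made up for the example.

```csharp
using System;

public static class FormatArgsDemo
{
    // The int is passed where an object is expected, so it gets boxed.
    public static string FormatWithBoxing(int count)
    {
        return string.Format("{0} bottles of beer", count);
    }

    // Assumed "ToString() trick": convert first, so only a string reference is passed and nothing is boxed.
    public static string FormatWithToString(int count)
    {
        return string.Format("{0} bottles of beer", count.ToString());
    }

    public static void Main()
    {
        Console.WriteLine(FormatWithBoxing(99));   // prints 99 bottles of beer
        Console.WriteLine(FormatWithToString(99)); // prints 99 bottles of beer
    }
}
```

The output is identical either way. Whether the saved allocation matters is a question for the profiler.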
Direct download: coding-blocks-episode-002.mp3
Category: Software Development -- posted at: 12:00am EDT