Announcements
- Please note that the solution to problem 3.43 is incorrect in the book. The correction:
p. 259, Practice Problem 3.43, line 6. Should state that register %esi is equal to 0x2, and register %edi is equal to 0x3.
For all book errata see http://csapp.cs.cmu.edu/public/errata.html
- Bit-wise operations in C. Shifting, masking. Last slides in "Chap 2: Representing Data"
- The conversion algorithm using shifting and masking (on Wednesday). Last slides in "Introduction to C"
- gdb: using the debugger. Online help.
- casting signed to unsigned: not really needed, but you'll need to understand this for the exam. See "Chapter 2: 2's complement"
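As a rough illustration of the shifting, masking, and signed-to-unsigned topics listed above, here is a minimal C sketch. The helper names (get_bit, get_byte, to_binary, as_unsigned) are made up for this example and do not come from the slides, and the binary conversion shown is only one plausible reading of "the conversion algorithm".

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Extract the bit at position pos (0 = least significant):
   shift the wanted bit down to position 0, then mask off the rest. */
unsigned get_bit(uint32_t x, unsigned pos) {
    assert(pos < 32);
    return (x >> pos) & 1u;
}

/* Extract byte n (0 = least significant) with the same shift-and-mask idea. */
unsigned get_byte(uint32_t x, unsigned n) {
    assert(n < 4);
    return (x >> (8 * n)) & 0xFFu;
}

/* Write the 32 binary digits of x into buf, most significant bit first.
   buf must have room for 32 characters plus the terminating '\0'. */
void to_binary(uint32_t x, char buf[33]) {
    assert(buf != NULL);
    for (int i = 0; i < 32; i++) {
        buf[i] = get_bit(x, 31u - (unsigned)i) ? '1' : '0';
    }
    buf[32] = '\0';
}

/* Casting signed to unsigned: the result is the value modulo 2^32,
   which on a 2's complement machine is the same bit pattern reinterpreted. */
uint32_t as_unsigned(int32_t v) {
    return (uint32_t)v;
}
```

For example, to_binary(5, buf) fills buf with 29 zeros followed by 101, get_byte(0x12345678, 1) gives 0x56, and as_unsigned(-1) gives 0xFFFFFFFF.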
About Comp 21000
Computer Science has been described as the "mechanization of abstraction". At the physical level computers are electronic machines that understand just two signals, a high voltage and a low voltage. We call the high voltage "1" and the low voltage "0". Thus computers are machines that perform simple computations such as arithmetic and logic operations on 1's and 0's. Every computer understands a number of instructions that are encoded as 1's and 0's and uses data that is encoded in 1's and 0's.
Since it is very difficult to program using just 1's and 0's, higher-level languages have been invented that are translated into 1's and 0's. Languages such as Pascal, Scheme, and C++ enable the use of abstract instructions to create sophisticated programs. Yet every program written in one of these languages must be translated into 1's and 0's before it can be executed on a computer.
There is a language level "lower" than these high-level languages, called assembly language. Assembly languages are less sophisticated than the higher-level languages that you have used in the past. They resemble the instructions written in 1's and 0's that the computer actually understands. In fact, some assembly languages are simply the machine language with the instruction codes replaced by names.
Since assembly languages are so close to the actual machine language, the use of the language is very dependent on the organization of the computer. In fact, to successfully program in assembly language one must understand how computers are organized.
The goal of this course is to study all levels of computer organization from assembly languages to applications. The focus of the course will be on the software that ties the hardware and software together, i.e., the operating system. To make this study concrete, we will study different ways of interfacing with the hardware from assembly language to system calls to system software.
Warning. This is a difficult course, more difficult than COMP 17100 or COMP 17200. There are a myriad of intricate details that must be mastered. Though each detail is in itself easily grasped, the sum of details can be staggering and you will spend a lot of time completing assignments.
Why organization and assembly language?
So why are you learning organization and assembly language? Aren't optimizing compilers better than any programmer at writing assembly code? Well, yes, for the most part they are. Still, there are some situations where assembly language is useful. See this Wikipedia article for some examples.
But another reason for understanding assembly language is that often pernicious memory bugs can only be found by stepping through the assembly code. You don't have to be able to write code in assembly language to track down these bugs, but you must be able to recognize and walk through such code.
Finally, many of the constraints that drive high-level programming languages exist because of the way processors are organized and assembly language is executed. Understanding run time stacks, for example, gives you much insight into how functions or methods work. Run time stacks, however, are really an assembly language concept that is best understood at that level.
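One way to glimpse the run time stack from C is to record the address of a local variable at every level of a recursion. The sketch below is a hypothetical example (sum_to, frame_addr, and MAX_DEPTH are names invented here). The exact addresses depend on the compiler and machine, so the only thing it relies on is that each active call has its own copy of the local variable.

```c
#include <stdint.h>

#define MAX_DEPTH 4

/* Address of the local variable seen by the call with argument n,
   stored at index n - 1. */
uintptr_t frame_addr[MAX_DEPTH];

/* Sum 1 + 2 + ... + n recursively, noting where each call keeps its local. */
int sum_to(int n) {
    int local = n;  /* each active call has its own copy of this variable */
    if (n <= 0) {
        return 0;
    }
    if (n <= MAX_DEPTH) {
        frame_addr[n - 1] = (uintptr_t)&local;
    }
    int rest = sum_to(n - 1);  /* the caller's local stays alive during this call */
    return local + rest;
}
```

After calling sum_to(4), the four entries of frame_addr are all different, because the four calls were active at the same time and each needed separate storage. On a typical machine that storage is a stack frame, which is what you see when stepping through the corresponding assembly code in gdb.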
Why C?
There are many reasons to use the C programming language. It's simple, flexible, and provides easy access to assembly language. The following selection from Wikipedia says it well:
One consequence of C's wide acceptance and efficiency is that compilers, libraries, and interpreters of other programming languages are often implemented in C. The primary implementations of Python (CPython), Perl 5, and PHP are all written in C.
Due to its thin layer of abstraction and low overhead, C allows efficient implementations of algorithms and data structures, which is useful for programs that perform a lot of computations. For example, the GNU Multi-Precision Library, the GNU Scientific Library, Mathematica and MATLAB are completely or partially written in C.
C is sometimes used as an intermediate language by implementations of other languages. This approach may be used for portability or convenience; by using C as an intermediate language, it is not necessary to develop machine-specific code generators. Some languages and compilers which have used C this way are BitC, C++, COBOL, Eiffel, Gambit, GHC, Squeak, and Vala. However, C was designed as a programming language, not as a compiler target language, and is thus less than ideal for use as an intermediate language. This has led to development of C-based intermediate languages such as C--.
C has also been widely used to implement end-user applications, but much of that development has shifted to newer languages.