One Million Ways

Feb 28

Computer Architecture Basics

A mumble-jumble of computer architecture basics to help me remember things for the midterm.

A signed number is a representation of numbers that can be both negative and positive. Unsigned numbers refer to only positive numbers. This is a topic of interest in computer architecture as a computer’s most basic instructions are viewed in 1’s and 0’s or numbers of base 2 (binary). A series of these numbers can formulate instructions that are interpreted and executed. Sometimes, though, a computer may need to execute an instruction with both negative and positive numbers. 

A first attempt to represent negative numbers was to simply add an extra sign in to indicate when something was negative. It was hard to determine where the extra sign should be located and held the problem that 0 could be positive OR negative. This attempt is referred to as sign and magnitude.

All modern computers today have adopted the use of two’s complement which basically has a number’s most significant bit be a 0 if it is positive and a 1 if it is negative. The number 0 is positive at all times.

Positive 4: 0000 0000 0000 0000 0000 0000 0000 0100

Negative 4: 1111 1111 1111 1111 1111 1111 1111 1100

An easy way to negate numbers is as follows:

Invert the bits and add 1.

00000000000000000000000000000100 = 4 base 10

is

11111111111111111111111111111011 + 1 (base 2) =

11111111111111111111111111111100 base 2 or -4

As mentioned previously, instructions are seen in binary representation and each piece put together forms the entire instructions. Registers are constantly being used so they have been mapped to specific numbers too. 

In the MIPS assembly language, registers 16-23 map to $s0-$s7 and registers 8-15 map to $t0-$t7

There are 5 steps in executing an instruction

  1. Fetch instruction, increment PC
  2. Decode instruction, read registers, calculate branch offset
  3. ALU, compare regs, take branch or not
  4. Memory read, lw/sw
  5. Write back to registers

Feb 27

All About Caches

In an attempt to study thoroughly for my midterm in computer architecture, I will be giving a lesson on caches via tumblr.

Overview:

A cache is a memory used by the CPU that stores copies of values or instructions for faster, more efficient access and usage. The cache is a relatively expensive piece of hardware which is why it is limited, especially for general consumers, but can increase performance in a computer by using the locality of reference principle. It decreases average access time to the slow main memory.

Performance Measures:

Hit Rate- percentage of hit

Hit Speed- how fast are the hits

Miss Penalty- How slow are the misses

Principle of Locality:

Spatial locality- locality principle that if a data location is referenced, it is likely that nearby data locations will be referenced also. 

Temporal locality- the locality principle that if a data location is referenced, it is likely to be referenced again soon.

Cache Entry Structure:

Index- describes which cache slot the data is in

The index length is  \lceil \log_2(\text{cache rows}) \rceil bits


Tag- The tag contains the most significant bits of the address. The are checked with the current slot to see if it is the block we need or something we don’t that happens to have the same slot #.

The tag length in bits is calculated by: (addresslength) - (index length) - (blockoffsetlength)

Block Offset-Specifies the desired data stored within the data block within the cache row

 The block offset length is \lceil \log_2(bytes\_per\_data\_block) \rceil bits.

Flag Bits:

There are two flag bits that I will talk about. The first is the valid bit. 

The valid bit is added onto a cache line and initialized at zero, when the cache is cleared. Once data is brought into the cache, the valid bit of that line is turned to 1. Basically, the valid bit helps determine if valid data has been loaded into the cache line yet or not. It generally isn’t turned back to zero again unless the cache is reset.

The dirty bit is another flag bit added onto a cache line to indicate whether the data on the line has changed since it was initially read from main memory. This helps increase performance when deciding when to write back data to main memory, which leads us to the next topic of writing operations in a cpu cache.


Write operations:

The write-through approach- when a write operation takes place, the cache and main memory are updated synchronously. This can prove to be slow as performance is based on main memory speed (ouch). A possible solution to its poor performance is implementing a write-buffer which basically serves as a queue for writing things back to the main memory, allowing the processor to move on with its life and pending operations. Another negative impact the write-through with a write-buffer may have is that the write-buffer uses the bus on every single write, which can negatively impact bandwidth.

image

The write-back approach- the write-back approach makes use of the dirty bit that was previously mentioned and is therefore a bit more complicated to implement. This approach only updates the cache and turns on the dirty bit once a modification is made. If and when the block of data needs to be replaced, THEN modifications are written to the cache if the dirty bit is on. A problem with this approach has to do with snooping caches where the incorrect data can be sent to I/O and such because it is hard to keep track of what has been updated in the cache and main memory.

Direct Mapped Caches:

This is a scheme where every address’ value in main memory has one specific slot in cache where it can be. Hit speed is good since there’s not much searching but the miss penalty is bad since if the data is not in the cache it has to be read in, which can happen often.

To find out which cache slot a main memory block is mapped to:

(main memory block #) modulo (# of cache lines)

Fully Associate Caches:

Scheme where a block can be placed in any location in the cache. Hit speed is slow because every single slot needs to be search in the cache to find the right data, or you can search through all entries at once (parallel), if you want to pay the $$.

Set Associate Caches:

This scheme helps hit rate but worsens the miss penalty. There is a fixed number of locations where each block can be, kind of like a direct-mapped cache, except each slot number has n-lines. 

To find out where a block of memory goes here :

(Block #) modulo (# of sets in cache)

Types of Misses:

Compulsory Misses-  cache miss caused by the 1st access to a block that has never been in the cache

Capacity Misses- cache miss that occurs because cache cannot contain all blocks needed to satisfy request

Conflict Misses-cache miss that occurs in set-associate and direct mapped when multiple blocks compete for same set and are eliminated  in a full associate scheme cache with the same size

image

Feb 24

Direct-Mapped Caches

What is a direct-mapped cache? 

It’s a scheme where each block from main memory has 1 place to appear in the cache - each mem block can be mapped to only one slot but each slot can receive more than one block.

To Find out which cacbe slot a main mem block is mapped to:

cache slot = (main mem block #) mod (# of cache slots)

-To keep track of which possible main mem blocks are in each cache slot, an x-bit tag field is added to each cache slot

-Diry bit - keeps track of whether or not a block has been modified while still in a cache

A slot that is modified must be written back to main mem before a slot is reused for another block

-Mapping main mem blocks to cache slots is done by dividing the mem address into fields: tag, slot and word.

-If the valid bit is 1, the tag field is referenced, it it’s a match, a word is taken from the position in the slot specified by the word field.

-If tags are not the same, the slot is written back to main mem if dirty bit is set and corresponding main mem block is read into slot.

Feb 13

Big O Notation

Found a cool article with great example on Big O notation, something that confuses many. Keep on learning!

A beginner’s Guide to Big O Notation

Feb 08

Read + Append a file in Perl

Reviewing some old script I wrote up when I started as an intern this semester, merely a month ago, I saw that I was opening a file to read it and do some functions, closing it, and then reopening it for appending data. I figured there must be a more efficient way and found that using +» is good for reading AND appending. A few minutes later, I discovered that my new script wasn’t properly running, and debugging, found that my problem was the read and appending. Apparently, appending moves the pointer to the very end of the file (since it’s appending), so it’s no longer looping through each line, as stated in my while loop, because the pointer is already at the end.

I found a seek method that I’ll be looking into and will hopefully have some success as soon as I stop blogging.

seek

Feb 05

Finding the Previous Node in a singly linked list

A doubly linked list makes life a whole lot easier to traverse back and forth but professors like giving weird assignments like these from time to time I suppose.

Algorithm:

Java:

public SingleNode findPrev(int data) {

SingleNode tempNode = head;

for (int i = 0; i < length(); i++) {

if (tempNode.getNext.getData != data) {

System.out.println(tempNode);

return tempNode;

}

.

.

.

Router Installs

Headed to an empty office late last night to install two fresh routers for a more dependable wifi signal. The loft-style office already had its cabling done which was convenient except we did not have a ladder to climb up and place the routers on in the storage compartment. We managed to set up an old plastic buffet table for me to climb on which gave out at one point, but luckily didn’t break. The legs just closed. It was a funny situation but resulted in success.

DNS Check in Perl

Latest small project in perl was an easy DNS check that turned into a bigger battle when moving it to a linux box. Everything has been solved and I’m pretty excited.

I used some big, public name servers to run the checks and make sure hostnames were pointing to where they should be. I decided against using Net::DNS and backticked the host script because I’m lazy and my boss told me to. If anything is wrong, I send an email using Net::SMTP. I originally wanted to use MIME::Lite but had some issues.

.

.

.

my @name_servers = (”8.8.8.8″, “156.154.70.1″, “4.2.2.1″, “8.8.4.4″,”208.67.222.222″);

foreach my $server (@name_servers) {
while ( my ($key, $value) = each %dns_check) {
my $found_addr = `host $key $server` or die “problem! $!\n”;
if ($found_addr =~ m/(\d.*)/) {
if ($1 ne $value) {
#send email if address pulled doesn’t match what it should be

White Mountains, 4K Footers

Someday I’d like to complete this list (hiking). I’ve only got 3 mountains down as of January 2012, but I’ll keep updating this and we’ll see how long it takes. I’d like to try out winter hiking too, maybe next year. School takes up a lot of my time but when I graduate, I will be hiking a lot more often. A secret dream of mine is to hike the AT through but I’m not sure if that will ever be possible for a number of reasons. I’d like to at least knock out the White Mountain 4k’s, experience the Adirondacks, hit Nevada up,and snowboard in the Swiss Alps. Ah, dreams

SSH Problem Solved

Shits old news. I ended up backticking instead of using the Net::SSH module. Just too many problems and too little time – I gotta impress, you know? I’ve heard an overall negative attitude towards backticking, even though my boss seems to use them…Probably much less secure. oh well, for this time at least..

Wanna see some of the code?

#files we need to check for
my %hostFiles = qw( somefile.txt anotherfile.rtf ;

#ssh to machine
my @files =`ssh $ARGV[0] “ls -a ~/Desktop”`;

chomp @files;

#compare files, send email if anything is missing
my %existing = map { $_ => 1} grep {exists($hostFiles{$_})} @files;

print Dumper([grep { !exists($existing{$_})} keys %hostFiles]);

Up to this point, I am ssh’ing, sticking the files we “need” on the machine in a hash and taking the files actually on the machine into an array and comparing them. If a file isn’t on the machine, an email is sent. Right now, I am using the MIME::Lite module, which has been a pain in the ass to print the array onto (suggestions?). And now that I got Net::Smtp working on the server, I’ll be implementing that instead of creating a stupid log file is wasting time.