Scratching the surface

To provide hands-on experience, Google sent us a brand-new Mini with a license for 50,000 documents. As expected, we received a very complete package that included everything we needed to get our Mini up and running quickly, with minimal fiddling. Naturally, we wanted to take a peek at the Mini's insides, but Google is still not keen on people trying to break into its systems. As we found in our prior review of the Mini, the case is solidly built, with exotic-looking tamper-resistant fasteners, and a large piece of plastic covers the top of the enclosure from front to rear to inhibit removal.



The current Mini is half its former size, and was delivered with everything we needed to get started.

The Mini's internals reveal that Google has made some changes since AnandTech's last look. The machine's current specs are as follows:

  • Supermicro motherboard (P8SCT)
  • Two 1GB modules of PC2-4200 DDR2 RAM, running at an effective 533MHz
  • A single 250GB Western Digital SATA2 Hard Drive
  • A 280 Watt PSU
  • A 3GHz Pentium 4 531 (Prescott core)

We questioned whether this hardware was the best choice for a search appliance, a point we touch on later in the article.



A closer look at what makes the Mini tick.

Unfortunately, our journey of discovery didn't take us beyond the hardware itself. Booting up the Mini, we were greeted with the bootup procedure of Red Hat Linux, which ended at a fitting blue login screen, leaving us completely locked out of the mysteries of what makes Google tick. Google understandably guards its technology carefully from prying eyes.

As before, when installing the Mini, the administrator connects the provided crossover cable to the Mini's admin port to perform initial configuration. This stage of setup is very simple: just assign the Mini a static IP address on the network it will be crawling, and configure the other general network settings. These steps require no special knowledge, since the provided Quick Start guide explains them clearly. After these installation steps are complete, the Mini's administration console is available over the network through a web browser for further configuration. Setting up the Mini's crawler was as simple as giving it some addresses to start from, and adding URL patterns to include and to avoid. For example, if we wished our Google Mini to crawl two separate webservers and a fileserver, we would give the Mini a hyperlink to each website and an SMB link (smb://) to the fileserver, and the crawler would get to work.


Some links to the websites are all that is needed for the Mini to get started.
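
To make the idea of start addresses plus include and avoid patterns concrete, here is a minimal Python sketch of how such a filter could decide whether a URL gets crawled. Everything in it is an assumption for illustration: the host names, the patterns, and the `should_crawl` helper are hypothetical, and the shell-style wildcards are not the Mini's own pattern syntax, which we did not inspect.

```python
from fnmatch import fnmatchcase

# Hypothetical crawl configuration: two webservers and one fileserver.
# These hosts and patterns are made up; they are not the Mini's real syntax.
START_URLS = [
    "http://www1.example.lab/",
    "http://www2.example.lab/",
    "smb://fileserver.example.lab/share/",
]
INCLUDE_PATTERNS = [
    "http://www1.example.lab/*",
    "http://www2.example.lab/*",
    "smb://fileserver.example.lab/share/*",
]
EXCLUDE_PATTERNS = [
    "*/private/*",  # skip anything under a "private" directory
    "*.tmp",        # skip temporary files
]


def should_crawl(url, include=INCLUDE_PATTERNS, exclude=EXCLUDE_PATTERNS):
    """Return True if url matches an include pattern and no exclude pattern."""
    if any(fnmatchcase(url, pattern) for pattern in exclude):
        return False
    return any(fnmatchcase(url, pattern) for pattern in include)


if __name__ == "__main__":
    for candidate in START_URLS + ["http://www1.example.lab/private/notes.html"]:
        print(candidate, should_crawl(candidate))
```

In this sketch the avoid patterns win over the include patterns, which is one reasonable design choice; a real crawler may resolve conflicts differently.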

Starting the Mini's actual crawl is as simple as that, but there are many options that provide more detailed control. In our case, we let the Mini crawl some of the websites running in our lab, but found that we hit the 50,000-document limit in a matter of hours, so we settled for Samba crawling in most of our tests.


19 Comments


  • GhandiInstinct - Friday, December 21, 2007 - link

    lol
  • legoman666 - Friday, December 21, 2007 - link

    I would have expected this product to be a few years old with hardware like that. A prescott? seriously. And no RAID?
  • razor2025 - Friday, December 21, 2007 - link

    It's a search engine appliance. The product's main focus is its software algorithm, not how "fast" the hardware itself is. Why would it need RAID? Any sane network/system administrator will have this box backed up at regular intervals to the backup array / server. RAID != backup, and this product doesn't need the file system performance either.
  • legoman666 - Friday, December 21, 2007 - link

    I didn't comment about the Prescott and the lack of RAID based on a performance concern. The Prescott is hot and inefficient, so why not get something that uses less power (e.g., a C2D) even if it doesn't need the added processing power of a newer chip? That way, they could market it as an efficient device or green or whatever.

    As for the RAID, I am not talking about RAID0 (technically that's not even RAID), I was leaning more towards RAID1 or RAID5. They mentioned in the review that it took 36 hours to crawl to the 50,000 document capacity. I'm sure most people wouldn't want their search function down for 36 hours while the engine reindexes because it wasn't backed up. Not only that, but you'd probably have to send it back to Google for repairs with only a single drive. With 2 in RAID1, if one dies, a replacement could easily be swapped in.
  • razor2025 - Friday, December 21, 2007 - link

    Maybe it's an option you can request from Google. As for your take on RAID, you're still treating it as backup. It would be much simpler if they had a second backup Google Mini instead. Look, they're charging you for the license per document, not for how many Minis you have hooked up. Also, it's in a 1U form factor. I highly doubt they can manage to squeeze in another drive to satisfy your "RAID!" obsession.
  • Justin Case - Friday, December 21, 2007 - link

    Backups take time to restore from. RAID1 means no downtime. It *is* a backup, and one that's available instantly.

    It doesn't replace regular, preferably _remote_ backups, but it's a pretty basic feature of any system designed to have zero downtime.
  • reginald - Wednesday, January 2, 2008 - link

    RAID and backup are two entirely different things. No RAID in the world can protect you against the same things as backups can (handling errors, programs incorrectly overwriting data, etc). And backups can never replace RAID to achieve continuous availability.

    Thinking you need no backups because you have RAID is like thinking you need no seatbelt because you've got insurance. They simply aren't the same.
  • rudder - Friday, December 21, 2007 - link

    Prescott performance aside... as the article mentioned this is a 24/7 device... why use such a toaster of a cpu when Core2Duos would not add a whole lot to the bottom line?
  • Calvin256 - Tuesday, January 1, 2008 - link

    If you're looking at the prices as a consumer, that may be the case, but you need to remember that Google/Gigabyte is not you or me. When purchasing in bulk, those processors can be VASTLY cheaper than we could ever hope to pay, even when they're in the bargain bin at shadyetailer.com. Things made for consumers can easily be marked up 200-2000%, while things made for OEMs might have a 50-100% margin.
