Standard cells: Looking at individual gates in the Pentium processor

211 points by todsacerdoti 3 months ago

> Intel started using automated place and route techniques for the 386 processor, since it was much faster than manual layout and dramatically reduced the number of errors. Placement was done with a program called Timberwolf, developed by a Berkeley grad student. As one member of the 386 team said, "If management had known that we were using a tool by some grad student as a key part of the methodology, they would never have let us use it."

The grad student was Carl Sechen, advised by Alberto Sangiovanni-Vincentelli.

https://ieeexplore.ieee.org/document/1052337

chasil 3 months ago

The Computer History Museum has an interview with the i386 designers, where this arrangement was discussed. Carl Sechen's name is not mentioned.
https://archive.computerhistory.org/resources/text/Oral_Hist...
"...we finally made the decision that we should go with automatic place and route. Neither one of those things existed at Intel and the concern was could we get it done in time and would it blow up the areas of the chip so that they wouldn’t fit and then it would all fall apart and we’d have to do it by hand. So what we did, we got an automatic placement program from a grad student at Berkeley, it was called Timberwolf and we checked it out and it seemed to do an adequate job so we had his software. He moved to MIT to work on another project and we actually had a terminal set up in his campus room where he’d fix bugs in the auto placement program as they came up. But luckily the whole thing came together and worked. There are several points in time where we’d get stuck and have to be waiting for him to fix his program. So that would take the individual cells and put them within a rectangle in an optimal situation for speed.
"...I was just going point out that if management had known that we were using a tool by some grad student as the key part of the methodology, they would never have let us use it."
EDIT: I didn't realize that Right-o had an article on i386 place and route with standard cells that also links to the panel interview. The specific areas of the i386 die that used standard cells are identified.
https://www.righto.com/2024/01/intel-386-standard-cells.html
- oldgradstudent 3 months ago
  
  Weird that nobody bothered to do a simple google search.
  "Bart, don't make fun of grad students. They just made a terrible life choice."
andai 3 months ago

"At Intel, nobody knows you're a grad student."
- justinjlynn 3 months ago
  
  At Intel, nobody pays the grad student proportionally for their massive contribution to the bottom line.
  
  robertlagrant 3 months ago
  
  [flagged]
  
  account42 3 months ago
  
  Aka how to let the world be run by those who seek profit for themselves without consideration about the negative externalities.
  
  robertlagrant 3 months ago
  
  How to let the world be populated by people who can make free decisions about their work and their employer. Nothing to do with negative externalities, unless you think feudalism was particularly green.
  
  analognoise 3 months ago
  
  The two sides (A grad student, Intel) are not operating on an equal field. As such the "free decisions" aren't really a valid dimension for "free dimensions", as we have to include how unfair the field is in this case (...in all cases).
  Intel did 152 BILLION DOLLARS in stock buybacks over 35 years and ruined their own lead. They screwed/squeezed a lot of people so they could juice the stock price - to the point where they lost the semiconductor lead (as that wasn't what they were most worried about - they were mostly concerned with stock price). We need to consider all of these things when we determine if the field is fair for an employee.
  
  robertlagrant 3 months ago
  
  Of course it's not equal - it shouldn't be. If you want to be able to seriously negotiate with Intel as an equal, you have to do a lot more than be a grad student.
  But it's free. Which it should be.
  
  analognoise 3 months ago
  
  We set the market, and should be equalizing the power between the two with regulation.
  Extensive regulation.
  
  robertlagrant 3 months ago
  
  Why would one grad student have the same power as 10000 employees of a company? Surely when you're finishing with ominous repetition you must know something's a bit off[0].
  [0] https://www.youtube.com/watch?v=ToKcmnrE5oY
  
  klyrs 3 months ago
  
  This turned into a bit of a toxic tangent, didn't it?
  
  Hikikomori 3 months ago
  
  You're funny.

casenmgreen 3 months ago

I can't see any images.

This is because of CloudFlare.

When I go to the page, I get the CF "are you human" check, which I complete.

However, every image load is also getting that check, but those checks are not presented to me - just the image doesn't load because a HTML page is being returned.

andai 3 months ago

The other day I tried to scan a file on VirusTotal. I got an infinite loop of infinitely slow fading "select all the fire hydrants", followed by rejection, 10 times in a row before I gave up.
Almost as though they had rejected me before the captcha and were merely torturing me for their amusement?
Even more bizarrely, VirusTotal presented a second upload form on the captcha page... which itself is captcha-free...
fsckboy 3 months ago

i use uMatrix and I'm well familiar with the Cloudflare "are you human" steps... I am not encountering the problem you describe, and Cloudflare is not listed as being involved on the dashboard
- account42 3 months ago
  
  You do realize that a large part of the web routes through buttflare even if no buttflare domains are involved, right? Whether one of those requests retrurns the buttflare captcha instead of the requested resource is controlled by buttflare and can vary from user to user.
  
  fsckboy 3 months ago
  
  as I said, I'm well familiar with the cloudflare domain "are you human" challenges.
immibis 3 months ago

Ah, the Great Firewall of Corporate America.

gttalbot 3 months ago

> Modern processors, with their nanometer-scale transistors, are much too small to study under a microscope.

So, can we all take up a collection so Ken can get a nice electron microscope, or what?

amelius 3 months ago

Isn't modern EDA software sophiscated enough that it can place transistors as it sees fit, rather than rely on standard cells?

morphle 3 months ago

No. Actually, the state of the art of EDA software is worse.
My project has been to design (and create) better EDA software that will simulate, optimize and therefore can form and place each individual transistor optimally to achieve lower power, higher speed and lower cost. There is only one drawback over all existing EDA software: my EDA tools must run on a ($100k) small supercomputer or FPGA cluster because it deals with a billionfold more transistors than existing EDA software and that takes more compute. It means my software is much cheaper than existing EDA software but will yield much better chips and wafers with much faster, better, cheaper and fewer transistors.
A high level overview of my software is mentioned indirectly in my talk https://vimeo.com/731037615
I'm eager to give a talk on my EDA software as well, please consider inviting me to give it?
Other researchers and companies have proven that optimizing transistor design and placement over standard cell libraries and PDKs can be done, for example:
https://www.micromagic.com/news/Ultra-Low-Power_PressRelease... was done with their own EDA software.
I am very certain (but have no hard proof) that this is what Apple did on their M1, M2, M3. M4 and M5 processors, especially their high end M2 and M5 Ultra chips.
What I'm claiming here is that humanity can design three to four orders of magnitude faster computer chips using at least two orders of magnitudes less energy making chips orders of magnitude cheaper if we only used better EDA software (CAD=> SYM=> FAB) that we use today. Moore's law is not at an end. I'd be happy to provide proof of this, but that takes a bit more effort than a HN comment.
- hansihe 3 months ago
  
  I don't know about this at any detailed level, but doesn't designing standard cells for leading edge nodes involve a lot of trial and error? Is a lot of the issues that can occur even well understood to the level that it can be simulated?
  With the approach you mention, would it involve creating "custom standard cells", or would the software allow placement of every transistor outside of even a standard cell grid? If the latter, I would have trouble believing it could be feasible with the order of magnitude of computing power we have available to us today.
  
  morphle 3 months ago
  
  The best results will be with custom shapes and custom individual placement of every transistor outside standard cell but within the PDK rules. Going outside the PDK rules will be even better but also harder.
  The trial and error you do mostly by simulating your transistors which you than validate by making the wafers. You can simulate with mathematical models (for example in SPICE) but you should eventually try to simulate at the molecular, the atom/electron/photon and even at the quantum level, but each finer grained simulation level will take orders of magnitude more compute resources.
  Chip quality is indeed limited by the magnitude of computing power and software: to design better (super)computer chips you need supercomputers.
  We designed a WSI (wafer scale integration) with a million core processors and terabytes of SRAM on a wafer with 45 trillion transistors that we won't chip into chips. It would cost roughly $20K in mass production and would be the fastest cheapest desktop supercomputer to run my EDA software on so you could design even better transistors for the next step.
  We also designed a $800 WSI 180nm version with 16000 cores with the same transitors as the Pentium chip in the RightTo article.
  
  isotypic 3 months ago
  
  Has this WSI chip been taped out/verified? I must admit I am somewhat skeptical of TBs of SRAM, even at wafer scale integration. What would the power efficiency/cooling look like?
  
  morphle 3 months ago
  
  The full WSI with 10 billion transistors at 180nm has not been taped out yet, I need $100K investment for that. This has 16K processors and a few megabyte SRAM.
  I taped out 9 mm2 test chips to test transistors, the processors, programmable Morphle Logic and interconnects.
  The ultra-low power 3nm WSI with trillions of transistors anda Terabyte SRAM will draw a megaWatt and would melt the transistors. So we need to simulate the transitors better and lower to power to 2 to 3 terawatt.
  There is a youtube video of a teardown of the Cerebras WSI cooling system where they mention the cooling and power numbers. They also mention that they also modeled their WSI on their own supercomputer, their previous WSI.
  
  Robin_Message 3 months ago
  
  This sounds exciting but the enormous and confusing breadth of what your bio says you are working on, and the odd unit errors (lowering "a megawatt" to "2 to 3 terawatt), is really harming you credibility here. Do you have a link to a well-explained example of what you've achieved so far?
  
  actionfromafar 3 months ago
  
  Have to agree. It's fine to have past achievements in the bio I guess but if you are looking for money it doesn't hurt to appear focused.
  
  initramfs 3 months ago
  
  https://spectrum.ieee.org/1-bit-llm could lower power consumption of data centers.
  
  osnium123 3 months ago
  
  Are you concerned that going away from standard cells will cause parametric variation, which reduces the value proposition? Have you tested your approach on leading FinFET nodes?
- initramfs 3 months ago
  
  I read this is the software they used to make the MicroMagic https://sourceforge.net/projects/mmi-pd (someone on The Open-Source Silicon Slack mentioned it, I think https://open-source-silicon.dev/
- initramfs 3 months ago
  
  Hello, I am interested in your research as well as MicroMagics. The Claremont (32nm Pentium) and MicroMagic are the only application processors that have utilized NTV by stabilizing the voltage at 350mV-500mV. I started a project to make solar powerable mobile devices https://hackaday.io/project/177716-the-open-source-autarkic-... My email is available in the github's linked.
  There are other processors, such as Ambiq Micro, but they are Cortex M4 and M55: https://www.top-electronics.com/en/apollo510-soc-250mhz-3-75...
  
  morphle 3 months ago
  
  We've designed $0,10 ultra low power 8/16/32/64 bit processor SoC with built in MPPT buck/boost converters so they can be powered directly from single solar cells or small charge Li-ion cells. They have low power networking so you can create clusters. I'm not sure yet if they will be even lower power than you 5 mW processors but 1 mW is our aim.
  I would argue that a solar powered computer would benefit from a 2 megabyte (could be as low as 128KB) SRAM operating system with GUI like Squeak or Smalltalk-80 instead of a Linux as you propose. We've learned a lot from the low power OLPC designs.
  Thanks for the invite, I'm eager to collaborate on your solar powered computers but I'm having trouble finding your email in your githubs. Could you email us morphle73 at g mail dot com?
  
  initramfs 3 months ago
  
  Thanks, it is giovanni dot los___ at gmail.com (it should be in your inbox)
  Also, Ambiq Micro has a 2MB MRAM (+2.75MB SRAM) processor https://www.top-electronics.com/en/apollo4-blue-plus-192-mhz... that can run on solar power (it uses ~5uA/MHz, Cortex M4) and Andreas Eriksen got LISP text editor to run on 384K RAM with the Apollo3 https://hackaday.io/project/184340-potatop and 4'4" Memory In Pixel Display.
  I read that lithium-ion capacitors have much faster charging than regular li-on. https://www.tindie.com/products/jaspersikken/solar-harvestin... (and longer lifespan)
  
  RF_Savage 3 months ago
  
  That sounds very suitable for solar powered IoT and IIoT sensors, so the talk about GUI's feels confusing. Zephyr or freertos are perfectly fine with sub-2meg amounts if SRAM.
  
  kragen 3 months ago
  
  this is pretty exciting! i agree about the squeak-like approach. what would you use for the screen? i've been thinking that sharp's memory-in-pixel displays are the best option, but my power budget for the zorzpad is a milliwatt including screen, flash, and keyboard, not just the soc
  
  morphle 3 months ago
  
  There are ultra low power ePaper displays that only need power to change pixels but need no power to light the display or hold an image. They are usually black and white or grayscale.
  > Typically, the energy required for a full switch on an E-Ink display is about 7 to 8mJ/cm2.
  >The most common eInk screen takes 750 - 1800 mW during an active update
  The Smalltalk-80 Alto, the Lisa and the 128K Mac had full window GUIs in black and white and desk top publishing.
  The One Laptop Per Child (OLPC) had low power LCD color screens especially made for use in sunlight and would combine nicely with solar panels.
  
  kragen 3 months ago
  
  hey, i've been looking for those numbers for years! where did you get them?
  the particular memory lcd i have is 35mm × 58mm, which is 20cm², so at 7½ millijoules per square cm, updating the same area of epaper would require 150 millijoules to update if it were epaper. the lcd in fact requires 50 microwatts to maintain the display. so, if it updates more than once every 50 minutes, it will use less power than the epaper display, by your numbers. (my previous estimate was 20 minutes, based on much less precise numbers.) at one frame per second it would use about a thousand times less power than epaper
  so in this context epaper is ultra high power rather than ultra low power. and the olpc pixel qi lcds, excellent as they are, are even more power-hungry
  pixel qi and epaper both have the advantage over the memory lcd that they support grayscale (and pixel qi supports color when the backlight is on)
  
  morphle 3 months ago
  
  >hey, i've been looking for those numbers for years! where did you get them?
  I just googled "low power epaper" and read the summaries for mentions of mW and J
  
  kragen 3 months ago
  
  can you link to the sources that those numbers came from? i wouldn't be surprised if serp summaries were llm-generated nonsense
  
  kragen 3 months ago
  
  like, when i search for that phrase i get a lot of pages like https://www.marctech2.com/news/e-paper-faq-and-applications which is obviously llm-generated bullshit, some pages like https://goodereader.com/blog/e-paper/swedish-startup-rdot-is... with a conflicting number of 4 millijoules per square centimeter, and no links to actual sources who took measurements
  
  kragen 3 months ago
  
  what's ntv?
  
  morphle 3 months ago
  
  NTV = Near Threshold Voltage
  https://www.anandtech.com/show/5555/intel-at-isscc-12-more-r...
  >At IDF last year Intel's Justin Rattner demonstrated a 32nm test chip based on Intel's original Pentium architecture that could operate near its threshold voltage. The power consumption of the test chip was so low that the demo was powered by a small solar panel. A transistor's threshold voltage is the minimum voltage applied to the gate for current to flow. The logical on state is typically mapped to a voltage much higher than the threshold voltage to ensure reliable and predictable operation. The non-linear relationship between power and voltage makes operating at lower voltages, especially those near the threshold very interesting.
  
  kragen 3 months ago
  
  aha, thanks! the description of the threshold voltage in the text is wrong though; it would imply ambiq’s product wouldn't work at all
- avnd 3 months ago
  
  That’s a very ambitious project to say the least and I’ll bite! Please elaborate
  Also, have you checked out the OpenROAD[1] project? It’s a pretty impressive open source RTL to GDSII flow.
  I went to their most recent meetup at DAC’24 and there’s a great community around the project.
  [1] https://theopenroadproject.org/
  
  morphle 3 months ago
  
  >Please elaborate
  I'd love to but what do you want me to elaborate on?
  We started making EDA tools and simulators (CAD, SYM FAB as Alan Kay says) and designing a wafer scale integration to run parallel Squeak Smalltalk (David Ungar's ROARVM) in 2007 and we are still working on it in 2024 so I estimate 30,000 hours now. I call that very ambitious too.
  >Also, have you checked out the OpenROAD[1] project? It’s a pretty impressive
  No it is not pretty impressive EDA software, OpenROAD software quality is like Linux, " a budget of bad ideas" as Alan Kay typifies it. Openroad is decades old sequential program code, millions of lines of ancient C, C++ and bits of Python programs written in the very low level C language, riddled with bugs and pathes. The tools are bolted together with primitive scripts and very finicky configurations and parametric rules. Not that the commercial proprietary EDA software is any better, that usually is even worse but because you don't see the source code you can't see the underlying mess.
  Good EDA tools should be written by just a few expert programmers and scientists in just a few thousand lines of code and run on a supercomputer.
  So the first ambitous goal is to learn how to write better software (than the current dozens of millions of lines of EDA software code). Alan Kay explains how [1-3]:
  [1] https://www.youtube.com/watch?v=ubaX1Smg6pY
  [2] https://www.youtube.com/watch?v=Kj4fLRm2UC4
  [3] https://www.youtube.com/watch?v=1e8VZlPBx_0
  The second ambitous goal is to (learn to) design ultra low power transistor and free space optics. Learn from the best quantum physicists: [4].
  https://www.youtube.com/watch?v=-dQoImLNgWs
  The biggest problem is you need to get a couple of million investment just to test your software with a few tape-outs.
  I only managed to invest the money for the 30,000 hours of labour so far, you can guess how many millions that's worth.
  
  themoonisachees 3 months ago
  
  I watched the video of your talk and it seems impressive. I work in an IP design firm, though I don't have any decision power there I could get you a foot in the door. If you're interested in trying to convince the brass, could you send an email to Eli dot senn at dolphin dot fr?
- therealcamino 3 months ago
  
  If you have working software that gives orders of magnitude improvements, needing $100k worth of hardware would be no barrier at all. That's a fraction of just the EDA software license budget for many projects.
- anamax 3 months ago
  
  A few decades ago, Quickturn sold a custom-built FPGA cluster to do hardware simulation at the gate level.
  Quickturn was bought by Cadence and now seems to be gone.
  
  FastFT 3 months ago
  
  Not sure if its from Quickturn’s acquisition, but Cadence still makes a hardware simulation platform called Palladium.
- BobbyTables2 3 months ago
  
  Could this be parallelized with a GPU?
  
  morphle 3 months ago
  
  I wouldn't want to do that, GPU compilers are not open source and the hardware is undocumented. As a scientist I feel they are extremely badly designed.
  My transistor and atomic simulation software is extremely parallel but not in the limited SIMD way that GPU's are.
caf 3 months ago

As the article mentions, producing an optimal layout is an optimisation problem where the related decision problem is NP-complete. Even laying out standard cells has to be done using heuristic solutions - blowing up the size of the problem by going from cells to transistors just makes that worse.
The logic is built out of standard gates and logic blocks like flip-flops anyway, so the overhead of using standard cells that implement those building blocks likely isn't too great.
- klyrs 3 months ago
  
  This is more apocryphal than lore, but the understanding I've picked up from EE friends is that standard cells are used because they're proven to work in a given fab process. You don't want your layout software coming up with a trillion different gate prototypes in the midst of laying out your logic circuit!
SuperscalarMeme 3 months ago

I'll give you an alternate take: the compute power available to EDA software has been roughly scaling at the same rate as transistors on a die. So the complexity of the problem relative to compute power available has remained somewhat constant. So standard cell design remains an efficient method of reducing complexity of the problems EDA tools have to solve.
- kens 3 months ago
  
  That's an interesting thought. However, it assumes that the problem scales with the number of transistors, i.e. O(N). I expect that the complexity of place and route algorithms is worse than O(N), which means the algorithms will fall behind as the number of transistors increases. (Technically, the algorithms are NP-complete so you're doomed, but what matters is the complexity of the heuristics.)
  
  morsch 3 months ago
  
  It's worse than that, isn't it? Not only are the algorithms presumably super linear, the transistor count has been increasing exponentially, but the compute power per transistor has been decreasing over time. See e.g. [1].
  Although I suppose if the problem is embarrassingly parallel, the SpecINT x #cores curves might just about reach the #transistors curve.
  [1] https://substackcdn.com/image/fetch/w_1272,c_limit,f_webp,q_... via https://www.semianalysis.com/p/a-century-of-moores-law figure 1
  
  kragen 3 months ago
  
  yeah, that plots single-threaded performance, not total compute power. the point it's making is that now those transistors are going to parallelism rather than to single-threaded performance, and also the compute power per transistor stopped increasing around 02007 with the end of dennard scaling
  your problem doesn't have to be ep to scale to 10² cores
  i suspect it's true that compute power per transistor is dropping because thermal limits require dark silicon, but that plot doesn't show it
themoonisachees 3 months ago

All tools at use in the last gen industry (40-12nm) currently make extensive use of standard cell librairies provided by foundries. I don't expect current gen nor next gen to change anything.
Source: I work in EDA
- rsdfvn1099 3 months ago
  
  I definitely agree with your overall sentiment but if you actually care about your project go with one of the 3rd party standard cell libraries, i.e. https://www.synopsys.com/dw/ipdir.php?ds=dwc_standard_cell
  With the foundry provided library you get what you pay for...
frabert 3 months ago

I don't think it's a software issue -- AIUI the issue is that foundries will only let you use blocks for which the process was tested, or the yields would be unreliable/all over the place
- quotemstr 3 months ago
  
  Really? I can't just call TSMC and say "Hey. Here's a mask. Please print 10,000, thanks."?
  
  morphle 3 months ago
  
  no, the mask must be made within the tight rules of the proprietary (and very secret) PDK of the TSMC Fab for that node. Just getting it certified that it fits the rules will cost millions.
  https://en.wikipedia.org/wiki/Process_design_kit
  
  shrubble 3 months ago
  
  How were the Parallax Propeller guys able to do it? They have a Propeller2 that is in either 40nm or 28nm, IIRC.
  
  RF_Savage 3 months ago
  
  Because those are now trailing edge nodes, so cheaper and easier entry for everybody.
  
  amelius 3 months ago
  
  Interesting. What is the granularity? Is it logic gates, or transistors?
  
  morphle 3 months ago
  
  the granularity is at all levels: standard cells (=gates), transistor shapes, metal layer shapes down to groups of atoms (for quantum dots).
  
  quotemstr 3 months ago
  
  Are all fabs like this?
  
  hansihe 3 months ago
  
  It's probably more of a node thing than a fab thing. You would have a much easier time getting the fab to do random stuff for you on a legacy node compared to a leading edge node.
  Leading edge nodes are basically black magic and are right on the edge of working vs producing broken chips.
  You as a customer would never want to be in a position where you are solely responsible for yields.
  
  morphle 3 months ago
  
  There are only a few Fabs with nodes smaller than 28nm. Yes, all fabs are like that, with exception of a few experimental tiny labs at research institutes or universities.
  
  crote 3 months ago
  
  There are open-source PDKs[0] for education, research, and R&D - but they extremely rare and decades behind state-of-the-art.
  Nobody wants to give away trade secrets, so everything remains proprietary and behind an NDA until it has become completely obsolete.
  [0]: https://www.skywatertechnology.com/sky130-open-source-pdk/
  
  amelius 3 months ago
  
  Maybe in reality it is somewhere in-between, where TSMC says: here is a set of standard cells that we tested; you can use them but if you modify anything then it's at your own risk.
  
  morphle 3 months ago
  
  Maybe in therory, yes. In practise the fab will never allow you to do anything at your own risk because it might contaminate or break their $170 million machine. If you offer a few billion extra to cover that risk, your cheaper off building your own fab instead.
  
  amelius 3 months ago
  
  I'm not convinced that moving transistors around freely would break the machine. How could that possibly happen?
  
  gary_0 3 months ago
  
  It takes years just to figure out how to use the machines to produce actual working chips. They're the most specialized, intricate, expensive machines in the world. They're operated in enormous clean rooms that only allow for 1 half-micron sized particle per cubic foot. No fab is going to run them without exactly following the procedures they have spent billions developing and testing. They are also going to avoid potential delays as much as possible because these machines have to be running for as much of their useful life as possible to recoup the vast expense. The level of risk-averseness is insane, but warranted.
  If you haven't watched this video, I highly recommend it: Indistinguishable From Magic: Manufacturing Modern Computer Chips https://www.youtube.com/watch?v=NGFhc8R_uO4 It's a little outdated now, but very comprehensive and gives you an idea of how totally nuts the chip business is.
  
  haneefmubarak 3 months ago
  
  Think of it less like moving transistors around and more like moving small groups of atoms around (also, you can get single transistor cells for analog and power designs). The processes required to place the rough groups of atoms in roughly specific places involves extreme amounts of energy relative to the size of what you are working with, which is supplied in a combination of chemical, radiation, and thermal forms. As a result, predicting what additional effects attempting to form specific micro-shapes might have is nontrivial. They extensively simulate and then carefully test all of their approved cells on an ongoing basis. Deviating from this could cause unknown issues, including damage, but doing the necessary work to prequalify your custom cells would be prohibitively expensive for most applications.
  NB: this is obviously a simplified explanation.
  
  morphle 3 months ago
  
  Going outside of the PDK rules you risk contamination because a chip machine is a high speed ultra-accurate automated mechanical and chemical laboratory. It sprays extremely corrosive acids and vaporizes metals. Going a nanometer outside of the rules would send droplet or flakes of chemicals flying around at high speed.
  https://www.epfl.ch/research/facilities/cmi/equipment/photol...
  
  oasisaimlessly 3 months ago
  
  It wouldn't, except potentially for macro-level requirements like maintaining the correct fill ratio [1]. Sibling comments are FUD.
  1: https://semiengineering.com/knowledge_centers/materials/fill...
  
  kens 3 months ago
  
  I'm very skeptical of this too. I don't see a mechanism where moving transistors could break the machine. Although I wonder if you could blow up the die testing machine by making a chip that was one big charge pump :-)
  
  rsdfvn1099 3 months ago
  
  It's not obvious what "moving" means in this thread but no foundry would ever let you make a design the grossly violates the design rule checks (DRC). It's hard to imagine how something risking damage of anything would get through all the many, many, checks, but wasting time and money is definitely possible and to be avoided.
  As an aside, few analog layouts will use the "line of diffusion" style that's common in the standard cells. And in analog one can find more exotic transistor patterns, like waffle [1], that aren't used in digital. Many things are possible, but it has to pass DRC.
  [1] https://www.radioeng.cz/fulltexts/2019/19_03_0598_0609.pdf
  
  eternauta3k 3 months ago
  
  Skeptical here as well, no one is proposing a serious mechanism for damage like "you left gate oxide uncovered by silicide so it contaminates the CMP machine" or whatever.
  What I can imagine, is that the foundry only tests their magic OPC algorithm with DRC clean inputs. If your mask isn't DRC clean, who knows what's coming out the other side.
  
  anamax 3 months ago
  
  > If you offer a few billion extra to cover that risk, your cheaper off building your own fab instead.
  You can't build a high-end fab for a couple of billion dollars.

jecel 3 months ago

One difference between the standard cells in the article and the current ones is that the routing channels have been eliminated thanks to the many metal layers we now have. Back then we couldn't really afford to have metal cross the Vdd and ground lines at the top and bottom of the cells so we just stretched the polysilicon lines to the top and bottom edges. Routing was done by continuing the poly into the channel and then connecting cells with metal. This meant that though the decapped poly lines are just one thing in the photos, in terms of design the parts inside the cells are standard and the parts in the channel are custom.

This scheme works even with just poly and one level of metal, but if you have enough metal layers than you can run them through the cells themselves. You just have to avoid the vias that take the inputs and outputs down to the transistors. You have an additional gain if you flip every other row of cells so that the PMOS of two rows have the Vdd rail overlap and the NMOS of two rows have the ground rail overlap.

dailykoder 3 months ago

Every single blog post I've read from Ken was mind blowing. Love his work. Keep it up man!

kens 3 months ago

Thanks! Now the pressure is on :-)

Harmohit 3 months ago

This is so cool! "Dissecting" a processor like this could be a fun educational activity to do in schools similar to dissecting a frog, but without the animal rights issues.

kens 3 months ago

Personally, I think everyone should try opening up a chip. It's easy (if the chip isn't in epoxy) and fun to look inside. You need a metallurgical microscope to examine the chip closely, but you can see interesting features even with the naked eye.
- Harmohit 3 months ago
  
  I didn't know there is such a thing as a metallurgical microscope. What makes them different from biological microscopes? And what is there primary purpose? I am assuming they don't make microscopes just for dissecting chips.
  
  kens 3 months ago
  
  A regular biological microscope shines the light from below. This is good for looking at cells, but not so useful when looking at something opaque. A metallurgical microscope shines light from above, through the lens. They are used for examining metal samples, rocks, and other opaque things.
  An external light works for something like an inspection microscope. But as you increase the magnification, you need something like a metallurgical microscope that focuses the light where you are looking. Otherwise, the image gets dimmer and dimmer as you zoom in.
  
  caf 3 months ago
  
  In some places, you've shown the same part of the circuit both with and without the metal layers. How did you find the same location on the die after taking the die out of the microscope, removing the additional layers and putting it back?
  
  kens 3 months ago
  
  I figured that I would want to study the standard-cell circuits, so I made a detailed panorama of one column of standard-cell circuits with the metal. Then after removing the metal, I made a second panorama of the same column. This made it easy to flip back and forth. (Of course, it would be nice to have a detailed panorama of the entire chip, but it would take way too long.)
  
  _ihaque 3 months ago
  
  Biological microscopes illuminate the sample from below, as the samples are typically transparent. Metallurgical microscopes illuminate reflective samples from above.
  *"Below" meaning "on the opposite side from the objective" - you illuminate _through_ the sample.
  
  fest 3 months ago
  
  Metallurgical microscopes illuminate the sample "from the top side". The actual implementation even goes as far as making sure the illumination happens on the optical axis of the objective (as if the light was emitted from your eyes/camera, reflected from the sample and then seen by your eyes/camera). They are also called reflected light or epi-illumination microscopes.
  Biological microscopes, on the other hand illuminate the sample from the back side (which doesn't work for fully opaque objects).
- userbinator 3 months ago
  
  Discarded RFID cards and the like provide a practically free source of minimally-encapsulated ICs, also often made on an old large process that's amenable to microscope examination.
  
  kens 3 months ago
  
  Having looked at a few RFID cards, there are a couple of problems. First, the dies are very, very small (the size of a grain of slat) so they are hard to manipulate and easy to lose. Second, the die is glued onto the antenna with gunk that obstructs most of the die. You can burn it off or dissolve it with sulfuric acid, but I haven't had success with more pleasant solvents.
wizzwizz4 3 months ago

Decapping a processor produces toxic waste, which has to be disposed of. Processors, properly handled, last a lot longer than frogs, and can be re-used again and again: to a first approximation, processors do not wear out. I would expect that manufacturing a new processor causes more suffering to more frogs than is caused by killing a frog for dissection.
That said: we have video players in our pockets. Sure, dissecting one frog might be a more educational experience than watching somebody else dissect a frog, but is it more educational than watching 20 well-narrated dissections? I suspect not. I don't think we need to do either.

nickpsecurity 3 months ago

There’s also open-source, standard cells for people interested. Here’s some links:

https://www.vlsitechnology.org/html/libraries.html

https://opensource.googleblog.com/2022/07/SkyWater-and-Googl...

omoikane 3 months ago

I am glad "pop culture" links to exactly the song I expected.