AM (myname4rwt.delete@this.jee-male.com) on June 4, 2018 11:00 am wrote:
> What started as another writeup continuing my previous posts with thoughts on Intel's 10nm process
> affair, which I was sharing here mainly out of frustration at the total lack of analysis on this topic,
> whether due to a pro-Intel agenda or a lack of the required background among tech writers, somehow
> evolved into something different. It was originally prompted by Charlie's article, to which I wanted
> to add only a few comments in a short post; then I unexpectedly found some free time on my hands and
> thought I'd finally put together some of my previous posts at RWT so they are in one place. Time still
> permitting, I added some structure and style to the text, giving it an article-like look.
>
> Have a nice read, folks.
>
> -------------
>
> A nice catch from Charlie, who looked at the ark pages for the i3-8121U. After taking a
> look at them, a few more interesting things caught my eye, so here is some more
> food for thought, in addition to Charlie's article and what I posted already.
>
> 1. No datasheet (and not just on the ark pages -- it appears to be missing from Intel's site entirely;
> try searching and let me know if you manage to find it, because I didn't). Releasing a chip
> without a datasheet is a trait shared mainly by dirty little chip cloners from SE Asia,
> not by a reputable semiconductor manufacturer like Intel. Or at least like the Intel of old.
>
> 2. A number of missing features, not just the GPU but also configurable TDP-down, among others
> -- both of which even the 3000-series Celerons (Intel's cheapest mobile offering) have.
>
> 3. Junction temperature is 105 °C. The problem is not the absolute value but the fact that
> it's 5 °C up from the similar 14nm i3-8130U -- and that with the GPU fused off. I take it as
> corroboration of the thermal-gasket effect that the use of cobalt in the metal stack has on
> temperatures, essentially making hot spots even hotter due to cobalt's 4x poorer thermal
> conductivity vs Cu.
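The thermal-gasket point above is easy to sanity-check with a one-dimensional conduction estimate. This is a rough sketch, not a model of Intel's actual stack: the conductivities are textbook bulk figures (thin films conduct worse), and the heat flux and layer thickness are assumptions of mine chosen purely for illustration.

```python
# Back-of-the-envelope estimate of the "thermal gasket" effect: temperature
# drop across a thin lower-metal region if it were Cu vs Co.
# Bulk thermal conductivities (W/(m*K)); real thin films are lower.
K_CU = 400.0   # copper, bulk
K_CO = 100.0   # cobalt, bulk -- roughly 4x lower, as the text notes

def delta_t(heat_flux_w_per_m2: float, thickness_m: float, k: float) -> float:
    """1-D steady conduction across a slab: dT = q * t / k."""
    return heat_flux_w_per_m2 * thickness_m / k

# Assumed hot-spot flux of 1e9 W/m^2 (e.g. 100 W spread over 0.1 mm^2)
# through an assumed 100 nm-thick lower-metal region.
q = 1e9
t = 100e-9
dt_cu = delta_t(q, t, K_CU)
dt_co = delta_t(q, t, K_CO)
print(f"dT across Cu layer: {dt_cu:.2f} K")
print(f"dT across Co layer: {dt_co:.2f} K")
print(f"extra hot-spot rise with Co: {dt_co - dt_cu:.2f} K")
```

The absolute numbers depend entirely on the assumed flux and thickness, but the 4x ratio is the point: whatever temperature drop the lower metal contributes, cobalt quadruples it right at the hottest spot.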
>
> Add to this the absence, to this day, of any big public announcement of 10nm products or of the
> process going into HVM (high-volume manufacturing) -- the off-hand comment made at CES hardly counts
> as such. Consider this: a company which was always full of pride about its ability to follow Moore's
> law for decades starts its long-overdue 10nm deliveries without even talking about them -- no press
> release, nothing. That CES remark, and the i3-8121U scores which appeared on Geekbench with no
> products in sight until just recently, generated only questions and theories as to what's really
> going on, rather than any belief that 10nm is finally ready and chips are shipping.
>
> So why did Intel start supplying their debut 10nm chips in such a quiet fashion at all?
> Just because the chip is defeatured to sub-Celeron level, and only for the bragging rights
> and a historical record that Intel's 10nm shipments started in Q2 2018?
>
> I don't think there's much reason for Intel to do that; after all, they created the ark entry for
> the i3-8121U only following the flood of news reports about the Lenovo notebook spotted in China,
> and we don't know whether that CES comment was planned or sanctioned by top management at all, as
> BK himself made no mention of it.
>
> While I'm at it, kudos to either someone extremely eagle-eyed or an insider (I'm going with the latter)
> who disclosed it first -- and it was not one of the news sites often quoted as the original source for
> this news. The credit goes to forum member not_someone over at the Anandtech forums, who broke the news
> ahead of multiple sites. If you happen to know an even earlier source, please do mention it in the comments.
>
> Getting back to the subject, I think Intel's main goals with this launch are different.
>
> Subtle problem
>
> I have already said that I don't buy the "low yield" stories Brian Krzanich tells, likely simply to
> avoid any questioning, and I have speculated as to the possible reasons behind Intel's neverending
> 10nm problems.
>
> To begin with, Intel introduced a whole bunch of innovations in their 10nm process, one of them being
> a copper-cobalt metal stack. It's beyond question that if the M0 wire cross-section keeps shrinking,
> sooner or later alternatives with a shorter electron mean free path will offer better conductivity
> than copper; the question is whether the time for the switch from copper to some alternative, even in
> the lower levels of the stack, has come.
>
> TSMC, Samsung and GF are all staying with a Cu stack at 7nm, and their minimum metal pitch is the same
> as Intel's at 10nm -- 36-40nm. GF is only replacing W with Co for contacts (I haven't seen the original
> paper and wonder what the purpose is; perhaps to reduce the Schottky barrier height and improve drive?)
> and using Co liners (probably replacing Ta in order to shrink the liner thickness) and caps in several
> lower levels of the metal stack, and TSMC isn't doing even that, I think.
>
> Regardless of the choice of replacement, Intel's switch from copper seems premature at best. Advances
> in copper deposition techniques allow resistivity as low as 3-4 µOhm·cm at <30nm CD -- lower than the
> bulk resistivity of cobalt (6-6.5 µOhm·cm) -- and Intel's competitors are probably well aware of that.
>
> As for the choice of cobalt, one serious thing to consider is that, unlike copper, it's brittle.
> Non-ferrous metals have no endurance limit, so one could design around mechanical failure from thermal
> cycling with a properly chosen safety margin. That fully applies to copper as well, but copper has been
> used for so long without major problems attributed to fatigue failure that I'd hazard a guess it
> responds with micro-yielding along grain boundaries once its fatigue strength falls below the stress
> resulting from thermal cycling. I wouldn't expect such a graceful fatigue failure from cobalt by
> default (or from other brittle materials in general).
>
> Besides, cobalt's thermal conductivity is 4x lower than that of copper.
> Using cobalt
> in the lower levels of the stack is like installing a thermal gasket between the transistors
> and the rest of the stack, effectively making hot spots even hotter.
>
> It might be the case that Intel's resulting problems are such that one of the things they are facing
> is mind-boggling variation in the reliability and life of their 10nm samples -- some chips work just
> fine for months, while others fail or become glitchy after weeks under test, and still others crumble
> in days. While that's just a hypothesis, it is consistent with two things we know.
>
> First, the repeated promises that 10nm HVM is just around the corner, as execs including BK probably
> sincerely believe that something as minor as the remaining reliability issues will be fixed Real Soon Now.
>
> Second, consistent with such an undercover release of their first 10nm chip: no public announcement,
> no datasheet, and shipping through a single OEM in China. If my hunches are correct, then in a
> situation like this any manufacturer would be scared like hell of releasing such a nightmare in high
> volume and with a big announcement, for reasons which need no explanation.
>
> So what can help when one stepping after another fails to fix those "minor" remaining reliability
> issues? With a process as complex as Intel's 10nm, fab time is probably around 2.5 months
> (I don't have exact figures, of course -- that's assuming they work fast, averaging a layer
> per day), and after that you need time for lengthy reliability tests to see if you can
> finally ship chips in volume -- and every time the answer from the QA team is "no".
>
> Solution
>
> One thing that comes to mind is a magic wand. No, stop laughing, I'm serious!
>
> If my theories and speculations turn out in the end to be correct, then Intel's brainpower
> apparently doesn't realize what kind of wall they are up against as a result
> of their decision to go with a heterogeneous copper-cobalt metal stack.
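Stepping back to the resistivity comparison earlier: those figures translate directly into per-length wire resistance. A minimal sketch, assuming a square 30 nm cross-section and the resistivity values quoted in the text; real damascene wires carry liners and barriers that shrink the conductor and make both numbers worse, and the size effect in narrow Cu is exactly why the comparison is contested at all.

```python
# Per-length resistance of a wire with square cross-section: R/L = rho / A.
def ohms_per_um(rho_uohm_cm: float, cd_nm: float) -> float:
    """Resistance per micrometre for a cd x cd square wire."""
    rho_ohm_m = rho_uohm_cm * 1e-8       # 1 uOhm*cm = 1e-8 Ohm*m
    area_m2 = (cd_nm * 1e-9) ** 2
    return rho_ohm_m / area_m2 * 1e-6    # Ohm per micrometre

CD = 30.0  # nm -- assumed width/height, matching the text's "<30nm CD" regime

r_cu = ohms_per_um(3.5, CD)    # advanced Cu fill, 3-4 uOhm*cm per the text
r_co = ohms_per_um(6.25, CD)   # bulk Co, 6-6.5 uOhm*cm per the text

print(f"Cu wire: {r_cu:.1f} Ohm/um")
print(f"Co wire: {r_co:.1f} Ohm/um")
print(f"Co/Cu resistance ratio: {r_co / r_cu:.2f}")
```

At these midpoint values the cobalt wire comes out nearly 1.8x more resistive per unit length, which is the author's premise: if advanced Cu deposition really holds 3-4 µOhm·cm at this CD, switching M0/M1 to Co costs conductivity today.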
>
> If, in what must have looked to some people at Intel like a touch of genius, you build a stack
> - using metals with significantly different thermal expansion coefficients (16.5 ppm/K for Cu vs
> 12-13 for Co),
> - one of them brittle, and with 4x worse thermal conductivity on top of that, making hot spots
> even hotter,
> how in the world are you going to fix that?!
>
> Even if your eventual indicator of chip health is its performance on a test rig under conditions
> closely matching those likely to occur in the real world, the rig is one thing -- every
> single setpoint is your choice, and even if you manage to end up with a stepping which begins to
> look, if not yet sellable, then at least reliable as long as you don't hit some nasty corner cases --
> the real world is a different thing. Where thermal cycling is concerned, every app has unique
> "fingerprints": it exercises the various blocks of a chip in a specific manner, activating them with
> a certain probability, and even that probability is usually a variable, not a constant.
>
> See what I'm getting at? Different people use different apps; there are millions of apps
> and billions of people; some never power down their PC, while those working on the run
> can flip the switch a dozen times a day. Wanna simulate that, or build a math model for
> the distribution of service life and probability of failure? Good luck with that.
>
> Besides, if you know for a fact that you have corner cases where your chip crumbles, you simply
> can't guarantee they won't happen with real applications, no matter how unique and improbable those
> cases may look (let alone when you know there do exist killer apps, pardon the pun). To add insult
> to injury, in the real world fans tend to slow down or stop, TIM to dry out, VRs to fail in various
> manners, dirt-cheap or faulty electrolytic caps to age quickly, and the list goes on. All of
> this starts to matter a lot more for a chip with a razor-thin reliability margin.
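The CTE-mismatch argument can be made concrete with a first-order elastic estimate, plus a toy Monte Carlo of the "every user cycles differently" point. Everything below is an illustrative assumption of mine, not data: the Young's modulus is a literature value for cobalt, and the temperature swing, cycles-to-failure figure, and cycle-rate range are made up to show the shape of the problem, not its magnitude.

```python
import random

# First-order elastic estimate: a film fully constrained by its neighbour
# sees sigma = E * delta_alpha * dT over a temperature swing dT.
E_CO = 209e9          # Pa, Young's modulus of cobalt (literature value)
ALPHA_CU = 16.5e-6    # 1/K, from the text
ALPHA_CO = 12.5e-6    # 1/K, midpoint of the 12-13 range in the text

def mismatch_stress_pa(delta_t_k: float) -> float:
    """Elastic stress from differential Cu/Co expansion over a swing dT."""
    return E_CO * (ALPHA_CU - ALPHA_CO) * delta_t_k

sigma = mismatch_stress_pa(75.0)  # assumed 75 K load/idle junction swing
print(f"mismatch stress per cycle: {sigma / 1e6:.0f} MPa")

# Toy Monte Carlo of the service-life distribution across users: assume a
# fixed cycles-to-failure budget (pure assumption) and users whose workloads
# thermally cycle the chip anywhere from 50 to 5000 times a day.
random.seed(1)
N_F = 200_000  # assumed cycles to failure
lifetimes_days = sorted(N_F / random.uniform(50, 5000) for _ in range(10_000))
print(f"median life: {lifetimes_days[5000]:.0f} days, "
      f"shortest 1%: {lifetimes_days[100]:.0f} days")
```

The stress comes out in the tens of MPa every single cycle, repeated thousands of times over a chip's life, which is fatigue territory; and even this crude model produces a lifetime spread of well over an order of magnitude across users, which is the author's point about the impossibility of bounding field behaviour from rig data.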
>
> One thing you can still try, in the situation that I think Intel found themselves in,
> is field-testing this flaky nightmare. You take your best part which begins to
> look alive on the rig (never mind that it happens only with the GPU fused off), ship it,
> and wait and see how high the death rate turns out to be. Who knows, maybe you get lucky.
>
> Needless to say, it's better done without any announcements and in some remote place with a
> language other than English, so that the risk of possible reputational damage is kept to a minimum.
>
> One interesting question
>
> An interesting question here is the price at which this chip is sold to Lenovo. Since a problem
> with components tends to have a seriously detrimental effect on the brand name, regardless of what
> common sense might suggest, what do you think happens when an OEM is offered an unknown CPU
> - which is not on the supplier's price list even after "launch",
> - for which there's no datasheet even after "launch",
> - which is fabbed on a new process which is long overdue and
> was not even officially announced as going into HVM,
> - and, to top things off, whose GPU is fused off for some reason, yet which is still
> 15W like its otherwise similar 14nm counterpart, with Tj bumped up to 105 °C?
>
> Right, questions arise, and many of them. Starting with "WTF is all of this supposed to mean?"
> and "Why should we bother at all and bet our reputation on this strange piece of something,
> among other things crippled to sub-Celeron level, yet still branded as Core i3?"
>
> That being said, what do you think would be a reasonable agreed-on price for the i3-8121U
> when dealing with someone of Lenovo's caliber? Your guess is as good as mine, and quite
> frankly, mine is that the "sales" price of the i3-8121U to Lenovo is deeply sub-zero.
>
> What Intel is likely busy doing these days with their first 10nm product is not selling it
> for profit, or even shipping it for zero-profit revenue as one might think, but
> - supplying what really counts as rejects which belong in a scrapyard by any quality
> standard (certainly by the Intel standard of former years), but which -- I think Charlie is spot
> on there -- are the best (or perhaps the only?) thing they can offer at the moment,
> - paying up dearly for these supplies to make their way into Lenovo notebooks,
> - and expecting to use Chinese computer users as field-testers of this flaky nightmare.
>
> P.S. There are still many interesting questions: will Intel give up on the Cu-Co stack and join
> the rest of the industry, or will they persist and try to take that wall, erected by their own
> hands, by storm? If so, will they finally succeed or not? What do you think?
>
> Note that regardless of the exact nature and source of the problems Intel faces, this has been
> dragging on for about 2.5 years now, since 10nm was originally scheduled for 2015 (I've lost count
> of how many times the 10nm launch has been pushed back already), and with Intel's current 2019
> launch plans we're already talking about a 3-4 year delay. And that's not all, as process
> development programs at Intel run for about 4 years, and those 4 years do not include pathfinding
> and component research. That's the scale of an investment that's not even starting to pay off.
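The stepping-turnaround arithmetic implicit in the "Subtle problem" section (roughly a layer a day for about 2.5 months of fab time, then reliability testing before the QA verdict) can be written out. The layer count and the test duration below are assumptions of mine for illustration; only the layer-per-day rate and the ~2.5 month figure come from the text.

```python
# How many fix-and-retest iterations a year does a 10nm-class flow allow?
LAYERS = 75                  # assumed mask/process layer count
DAYS_PER_LAYER = 1.0         # "averaging a layer per day", per the text
RELIABILITY_TEST_DAYS = 42   # assumed ~6 weeks of stress/life testing

fab_days = LAYERS * DAYS_PER_LAYER     # ~2.5 months, matching the text
cycle_days = fab_days + RELIABILITY_TEST_DAYS

print(f"fab time per stepping: {fab_days / 30:.1f} months")
print(f"full learn cycle: {cycle_days / 30:.1f} months")
print(f"steppings possible per year: {365 / cycle_days:.1f}")
```

Under these assumptions each "no" from the QA team burns roughly a quarter, allowing only about three learn cycles a year, which goes some way toward explaining how a process problem can stretch into a multi-year delay.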