Background
Computer systems use volatile and non-volatile memory to store and retrieve data. Volatile memory is advantageous because it enables faster access times than conventional hard disk drive (HDD) or solid state drive (SSD) devices. Typically, volatile memory is accessed at least an order of magnitude faster than conventional HDD or SSD devices. Volatile memories are commonly used to store and retrieve short-lived data because they lose data when power is lost or removed.
Non-volatile memories have the advantage that they retain data even after power is lost or removed. Thus, non-volatile memory is typically used to store and retrieve data for longer durations and to allow other computer systems to access the data.
Because of the continuing demand for higher access speeds, most computer systems use a combination of volatile memory to store data temporarily and non-volatile memory to store and retrieve data over longer periods of time (for example, seconds, minutes, hours, days, months, or even years).
Volatile and non-volatile memory are typically assembled from individual dies of a semiconductor wafer manufactured by a particular photolithographic process. Typically, these wafers and their dies are tested for proper electrical operation and function against given voltage (direct current, DC) level and timing (alternating current, AC) level parameter specifications. Once one or more known good dies are identified within the wafer, these dies pass through a packaging process so that the final product can be built using individual volatile and non-volatile memory device packages.
Both volatile and non-volatile memory can be packaged and assembled using a single die, as well as dual, quad, octal, or more dies. In some cases, the package includes a monolithic die or multiple dies disposed in a subsystem.
Manufacturers and suppliers desire that volatile and/or non-volatile memory be reliable, robust, and operational over the life of the product. To help ensure these qualities, memory components are often subjected to rigorous testing. Given the complexity of volatile and non-volatile packaging, robust testing is important for building subsystems and, ultimately, fully assembled systems. Because one or more dies within a volatile and/or non-volatile memory package may present manufacturing-related problems (e.g., open or short circuits due to process and setup variations during assembly and manufacture of the package), it is desirable to test for and detect open signals (i.e., signals not connected to an intended point) and short circuits (i.e., signals erroneously connected to nearby power, ground, or other signals), both of which result in undesirable behavior. Poor connections to the memory module printed circuit board (PCB) may also exist due to mechanical stress, manufacturing problems, thermal problems, electrical problems, or any other manufacturing-related process variation. These open, short, and poor-connection faults can be detected using appropriate DC level tests.
Many companies offer generic semiconductor test equipment that is custom-built to test the voltage, frequency, and temperature of a particular memory using application-specific integrated circuits (ASICs) or field programmable gate arrays (FPGAs).
In general, such generic testers are limited to testing either volatile memory or non-volatile memory, and are not capable of testing both simultaneously. Generic testers are also limited in that they can only implement fixed functional test patterns according to existing industry standards using Automatic Test Pattern Generation (ATPG) patterns. These limitations are problematic because the memory is never tested in the end-user product running the end-user software. Nor do such testers replicate the conditions under which a memory failure occurs in an end-user product. Thus, there is no way to use a generic tester to effectively anticipate the needs of an end user or to fix the problems many end users encounter.
US 7,707,468 to Volkerink describes a test apparatus in which memory controllers are arranged in a star configuration in which each of a plurality of interface boards operates in parallel to test a plurality of memory modules. This arrangement allows for simultaneous testing of various memory modules at different rates of operation, but the patent remains silent with respect to simultaneous testing of volatile and/or non-volatile memory. Furthermore, the interface boards are not intelligent, so they cannot operate according to a number of different test modes downloaded from the memory controller. There is also no teaching, suggestion, or motivation to test memory modules while they are running application software.
Accordingly, there is a need for a semiconductor test system that is capable of testing (1) volatile and non-volatile memory devices (2) having one or more dies in a package (3) in a test environment that closely simulates the host system (e.g., personal computer (PC), notebook, server, cloud, etc.) and software of an end user. Ideally, such a system would provide detailed AC and DC level test coverage, as well as functional testing at normal, high, and low operating temperatures, at high, nominal, and low voltage levels, and at high, nominal, and low operating frequencies, to adequately stress, exercise, and perform real-world functional testing at the operating speed of the host system. Such a system should be capable of running end-user applications on a local operating system (OS) capable of executing one or more applications to simulate a target host application environment.
Disclosure of Invention
The inventive subject matter described herein provides dynamic real-time testing for volatile and non-volatile memory arrays. In contemplated devices, systems, and methods, a memory tester unit can dynamically reconfigure (a) its input and output (I/O) pin voltage levels and (b) its operating frequency without requiring a system restart, and can perform different real-time test modes for different volatile product families and/or different non-volatile product families.
Contemplated memory tester units include several components as described below. A master controller unit (MCU) manages one or more slave controller units (SCUs) capable of testing volatile and non-volatile memory. Each SCU is capable of executing one or more different real-time test patterns on one or more fully isolated memory channels of the same or different types of memory. By utilizing this modular approach, the MCU may program one or more SCUs in parallel to perform ganged/parallel testing on one or more memory devices, which increases productivity and testing speed and reduces overall testing costs. A single operator may load the appropriate test pattern into one MCU, and the MCU may upload the test pattern into one or more SCUs, each of which operates with the desired real-time test pattern. Each SCU memory channel tests one or more memory devices, whether the memory under test has a single die, dual dies, quad dies, octal dies, or another number of dies.
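The MCU-to-SCU fan-out described above can be illustrated with a minimal Python sketch. The class names `Profile` and `SlaveControllerUnit`, the `dispatch` function, and the placeholder "pass" results are hypothetical and are not part of the disclosure; a thread pool merely stands in for the parallel programming of several SCUs.

```python
from concurrent.futures import ThreadPoolExecutor
from dataclasses import dataclass


@dataclass(frozen=True)
class Profile:
    """Hypothetical container for a test profile loaded into the MCU."""
    name: str
    memory_type: str  # "volatile" or "non-volatile"


class SlaveControllerUnit:
    """Stand-in for an SCU; a real unit would drive hardware channels."""

    def __init__(self, scu_id: int, channels: int):
        self.scu_id = scu_id
        self.channels = channels

    def run(self, profile: Profile) -> dict:
        # Placeholder: report one "pass" per isolated memory channel.
        return {
            "scu": self.scu_id,
            "profile": profile.name,
            "results": ["pass"] * self.channels,
        }


def dispatch(profile: Profile, scus: list) -> list:
    """Hand one profile to every SCU and run the SCUs in parallel."""
    with ThreadPoolExecutor(max_workers=max(1, len(scus))) as pool:
        # pool.map keeps the reports in the same order as the SCU list.
        return list(pool.map(lambda scu: scu.run(profile), scus))


if __name__ == "__main__":
    units = [SlaveControllerUnit(0, channels=2), SlaveControllerUnit(1, channels=4)]
    for report in dispatch(Profile("ddr4_nominal", "volatile"), units):
        print(report)
```

In this sketch a single call to `dispatch` plays the role of the operator loading one test pattern into the MCU; each SCU then reports per-channel results independently.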
Master controller unit
As described herein, the MCU may be connected to each SCU through a conventional wired or wireless method to program and initialize the SCUs. The MCU may advantageously use an intelligent, fully automated scanner approach to load the appropriate MCU program, which in turn loads and runs each SCU. This automated process aims to further increase utilization, reduce human error, and reduce overall test costs.
As described above, each memory tester unit may have one or more MCUs, where each MCU is connected to one or more SCUs, and each SCU may address one or more memory channels. It is also contemplated that each memory channel may have one or more device under test (DUT) boards, where each DUT board has one or more sockets, each of which tests a single memory package of volatile or non-volatile memory. It will be apparent to those having skill in the art, upon reading the present application, that hundreds of memory packages may be tested at any given time using a single scanning program.
Upon starting the test profile, the MCU may apply one or more operational tasks, programs, functions, or algorithms to the one or more SCUs to detect undesirable manufacturing, assembly, or packaging defects, such as open and short circuits, in one or more dies of the volatile and/or non-volatile memory package. The SCU, together with its analog and digital logic, may use various means to detect these anomalies and faults in volatile and/or non-volatile memory packages. In one embodiment of the invention, the SCU may inject a controlled current into each DUT socket pin and its associated volatile and/or non-volatile memory package, and measure the corresponding voltage across the internal protection circuit diode of each memory package pin, to see whether there is current continuity from the SCU input/output pin to the volatile and/or non-volatile memory device seated in the DUT socket. If the device is correctly connected, the SCU will read the correct voltage drop across the internal protection diode and detect the correct connection. Otherwise, if the reading is zero, the SCU knows that the pin is open and faulty. In another embodiment, the SCU may inject current into one pin of a DUT holding a volatile or non-volatile memory package and read the voltage drop at adjacent pin locations of the same package to detect a short to nearby power, ground, or other adjacent signals, thereby detecting potentially shorted and failed volatile and non-volatile memory packages.
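The continuity check described above can be sketched in Python as follows. The voltage limits, the `PinStatus` names, and both function names are illustrative assumptions, not values from the disclosure; real limits depend on the device and the injected current. Treating any non-zero reading on a neighboring pin as a short is likewise one assumed interpretation of the adjacent-pin embodiment.

```python
from enum import Enum


class PinStatus(Enum):
    OK = "ok"
    OPEN = "open"
    OUT_OF_RANGE = "out_of_range"


# Assumed limits for a protection-diode drop; real limits are device-specific.
DIODE_DROP_MIN_V = 0.3
DIODE_DROP_MAX_V = 0.9
ZERO_READING_V = 0.05  # magnitudes at or below this are treated as "zero"


def classify_pin(read_voltage: float) -> PinStatus:
    """Classify a pin from the voltage read while current is injected into it."""
    magnitude = abs(read_voltage)
    if magnitude <= ZERO_READING_V:
        return PinStatus.OPEN            # zero reading: no continuity to the device
    if DIODE_DROP_MIN_V <= magnitude <= DIODE_DROP_MAX_V:
        return PinStatus.OK              # expected diode drop: pin is connected
    return PinStatus.OUT_OF_RANGE        # neither case; flag for further analysis


def find_shorted_neighbors(adjacent_readings: dict) -> list:
    """Return the adjacent pins that show a voltage while another pin is driven.

    Assumption: an isolated neighbor reads (near) zero, so any larger reading
    suggests a short between the driven pin and that neighbor.
    """
    return [pin for pin, volts in adjacent_readings.items()
            if abs(volts) > ZERO_READING_V]


if __name__ == "__main__":
    print(classify_pin(0.62))
    print(classify_pin(0.0))
    print(find_shorted_neighbors({"DQ1": 0.0, "DQ2": 0.58}))
```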
The MCU may be connected to a power management unit (PMU) configured to increase and decrease input voltages and currents to the SCU I/O pins, the voltage rails of the DUT memory sockets, and the rest of the circuitry. This flexibility allows the MCU to use the same ecosystem to test different memory device packages requiring different operating voltage levels (e.g., dynamic random access memory (DRAM) at 5.0 V, synchronous DRAM (SDRAM) at 3.3 V, double data rate SDRAM (DDR)-I at 2.5 V, DDR-II at 1.8 V, DDR-III at 1.5 V, DDR-IV at 1.2 V, etc.).
The MCU may also be connected to a clock management unit (CMU) configured to provide a wide range of clock frequencies for SCU core operation, along with the SCU I/O pins and the clock pins of the DUT memory sockets. This functionality allows the SCU to support many different memory technologies that require different operating frequencies (e.g., SDRAM at 100 MHz, DDR-I at 200 MHz, DDR-II at 400 MHz, DDR-III at 800 MHz, and DDR-IV at 1600 MHz).
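As an illustration only, the voltage and frequency examples given above could be held in a small lookup table that an MCU program consults before configuring the PMU and CMU. The table structure and the names `NominalSetting`, `NOMINAL`, and `settings_for` are hypothetical; the numeric values are the ones listed in the text.

```python
from typing import NamedTuple


class NominalSetting(NamedTuple):
    voltage_v: float     # rail voltage that would be requested from the PMU
    frequency_mhz: int   # clock frequency that would be requested from the CMU


# Values copied from the examples in the text. Plain DRAM (5.0 V) is omitted
# because the text gives no operating frequency for it.
NOMINAL = {
    "SDRAM": NominalSetting(3.3, 100),
    "DDR-I": NominalSetting(2.5, 200),
    "DDR-II": NominalSetting(1.8, 400),
    "DDR-III": NominalSetting(1.5, 800),
    "DDR-IV": NominalSetting(1.2, 1600),
}


def settings_for(family: str) -> NominalSetting:
    """Look up the nominal voltage and frequency for a memory family."""
    try:
        return NOMINAL[family]
    except KeyError:
        raise ValueError(f"no nominal setting for {family!r}") from None


if __name__ == "__main__":
    print(settings_for("DDR-IV"))
```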
The MCU may also be configured to program an onboard temperature management unit (TMU) chamber to provide nominal room temperature, low temperature, and high temperature (e.g., -50 °C to +155 °C) to further stress the devices under test and to better sort and separate good devices from marginal devices.
The MCU may also be configured to vary voltage, frequency, and temperature to perform robust three-corner testing while a customer memory test pattern is applied, and to run DUT devices under real-time operation as in a target customer system.
The MCU may also be implemented as an off-the-shelf personal computer (PC) or an embedded PC to perform its tasks. The software running on the MCU may advantageously perform all task management, controlling voltage, frequency, and temperature together with the SCU. In some embodiments, the MCU may be a dedicated processing unit (PU) comprising one or more microcontrollers, microprocessors, or processor units, with or without an operating system (OS), running one or more firmware (FW) programs to jointly control the SCU, PMU, CMU, and TMU units.
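Varying voltage, frequency, and temperature together amounts to sweeping every combination of low, nominal, and high values. The sketch below enumerates such a sweep in Python. The 5% voltage margin, the 10% frequency margin, the 25 °C room temperature, and the function names are assumptions chosen for illustration; only the -50 °C and +155 °C limits come from the text.

```python
from itertools import product


def corner_values(nominal: float, margin: float) -> list:
    """Return the low, nominal, and high values for one parameter."""
    return [round(nominal * (1 + sign * margin), 4) for sign in (-1, 0, 1)]


def corner_sweep(nominal_v, nominal_mhz, temps_c, v_margin=0.05, f_margin=0.10):
    """List every (voltage, frequency, temperature) combination to be tested."""
    return list(product(corner_values(nominal_v, v_margin),
                        corner_values(nominal_mhz, f_margin),
                        temps_c))


if __name__ == "__main__":
    sweep = corner_sweep(1.2, 1600, (-50, 25, 155))
    print(len(sweep), "test points")
    for volts, mhz, temp in sweep[:3]:
        print(f"{volts} V, {mhz} MHz, {temp} C")
```

Three values per parameter give 27 test points; an MCU program would apply each point through the PMU, CMU, and TMU before rerunning the same test pattern.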
Slave controller unit
The SCU may be configured in a variety of ways, including embodiments in which the SCU uses software running on a PC, hardware implemented in an ASIC, dynamic hardware implemented in an FPGA, or any combination of these. The SCU may be configured with solid state analog and digital logic, a finite state machine (FSM), one or more processing units (e.g., microcontrollers, microprocessors, processors, etc.), or a combination of one or more of these approaches.
In some embodiments, SCUs 516, 521 are configured to run one or more FW programs to manage one or more tasks on one or more channels of each SCU, each of which may be connected to one or more DUTs, and to apply one or more series of volatile and/or non-volatile memory test patterns to one or more sockets connected to each DUT unit. This configuration allows the SCU to fully perform parametric and functional testing of each socket on each DUT connected to each SCU channel.
In one embodiment according to the inventive subject matter, an SCU has DUTs and sockets for testing volatile memory devices.
In another embodiment consistent with the subject matter of this disclosure, an SCU has DUTs and sockets for testing non-volatile memory devices.
In yet another embodiment consistent with the subject matter of this disclosure, an SCU has DUTs and sockets for testing volatile dual in-line memory module (DIMM) devices.
In yet another embodiment consistent with the subject matter of this disclosure, an SCU has DUTs and sockets for testing non-volatile module devices.
Contemplated SCU processing units may include one or more microcontrollers, microprocessors, or processors running standalone bare-metal firmware (FW), a mini operating system, a full-fledged embedded operating system, or a real-time operating system (RTOS).
In general, the inventive subject matter described herein is directed to dynamically programming highly flexible tester units to test the entire ecosystem of semiconductor volatile and non-volatile memories by loading a configuration into an MCU, which dynamically programs and runs one or more SCUs, PMUs, CMUs, and TMUs. The contemplated apparatus, systems, and methods create a fully flexible, dynamic, programmable semiconductor test system to test different generations of volatile and non-volatile memory under different voltages, different frequencies, and different temperatures in order to fully stress the devices under test. Custom full-function tests may be run with real-time applications to enable a user to detect any AC and DC level parameter faults throughout the acceptable voltage, temperature, and frequency ranges, including errors that are undetectable by standard off-the-shelf semiconductor testers.
Firmware
In some embodiments, the SCU may run one or more firmware (FW) programs on one or more internal processing units, as standalone firmware or under a scheduler, operating system, or RTOS, to perform specific operational tasks, programs, functions, or algorithms that test volatile and non-volatile memory packages. The FW may perform any write, read, modify, read-modify-write, or compare activity.
The firmware running on the SCU processing unit may be implemented as one or more families of program code, as a single firmware program, or in a modular fashion to provide effective and efficient execution of the intended functions. The FW may be written in a low-level programming language (such as assembly language or machine code) or in a higher-level or object-oriented programming (OOP) language such as the C family (e.g., C, C++, C#, etc.).
The firmware may also implement one or more operational tasks (e.g., internal housekeeping tasks to ensure that all hardware blocks connected to the SCU are always initialized, started, and working properly, and to check the health status of each hardware block from time to time), one or more processes (e.g., tracking hardware block status so that the current task and any queued tasks are known, and collecting any relevant information to monitor, fine-tune, and report all relevant active internal operational processes to the MCU), one or more functions (e.g., internal functions that evaluate performance, hardware and firmware operational delays, hardware power consumption, etc.), and one or more algorithms (e.g., any volatile and/or non-volatile memory test, AC and/or DC level test, functional test, customer AC and/or DC test, customer functional test, or customer application test, to improve performance, reduce delay, and reduce power consumption). These may run on one or more processing units (PUs), such as microcontrollers, microprocessors, or processors, using standalone bare-metal firmware (FW), a mini operating system, a full-fledged operating system, or a real-time operating system (RTOS).
Any firmware algorithm executed by one or more processing units (PUs) for any operational task, process, function, or algorithm may be converted to its equivalent hardware algorithm and implemented in purely digital logic, analog logic, ASICs, or FPGAs to increase speed, reduce latency, and reduce power consumption. Such hardware implementations are generally more efficient in terms of speed, size, and power, which can significantly improve the performance of the tester unit.
Reducing the size of error logs
The firmware may perform a series of normal memory transactions to access volatile or non-volatile memory packages in order to perform firmware write, firmware read, firmware modify, firmware read-modify-write, or firmware compare activities. In some approaches, the firmware will access a volatile or non-volatile memory package having an address size of "N", starting from either the beginning (i.e., address "0") or the end (i.e., address "N-1"), to perform any task, program, function, or algorithm including writing, reading, modifying, or comparing. This may be accomplished by accessing each data line (e.g., x4 bit, x8 bit, x16 bit, x32 bit, x64 bit, x72 bit, x80 bit, etc.) sequentially (i.e., address by address) or randomly (i.e., the address may jump to any location within the valid address range of "0" to "N-1").
Volatile or non-volatile memory packages may be tested by converting linear memory addresses to row and column addresses. Where the memory package has multiple dies, linear memory addresses may be translated to row and column addresses for each die so that a host may conveniently write, read, modify, or compare data of any size at any address.
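The address conversion described above can be sketched in Python as a simple positional decomposition. The field order (die, bank, row, column, with the column varying fastest) and the names `Geometry`, `Location`, `decode`, and `encode` are assumptions for illustration; an actual memory package defines its own address mapping.

```python
from typing import NamedTuple


class Geometry(NamedTuple):
    dies: int
    banks: int
    rows: int
    cols: int


class Location(NamedTuple):
    die: int
    bank: int
    row: int
    col: int


def decode(addr: int, g: Geometry) -> Location:
    """Split a linear address into die, bank, row, and column indices."""
    total = g.dies * g.banks * g.rows * g.cols
    if not 0 <= addr < total:
        raise ValueError(f"address {addr} outside 0..{total - 1}")
    addr, col = divmod(addr, g.cols)    # column varies fastest
    addr, row = divmod(addr, g.rows)
    die, bank = divmod(addr, g.banks)   # die varies slowest
    return Location(die, bank, row, col)


def encode(loc: Location, g: Geometry) -> int:
    """Inverse of decode: rebuild the linear address."""
    return ((loc.die * g.banks + loc.bank) * g.rows + loc.row) * g.cols + loc.col


if __name__ == "__main__":
    geometry = Geometry(dies=2, banks=4, rows=8, cols=16)
    print(decode(17, geometry))
    print(encode(decode(1023, geometry), geometry))
```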
Conventional test equipment and methods can create very large error logs. A common volatile memory package may have a capacity of 8 Gbit (i.e., 1 Gig x 8, or approximately 1 billion 8-bit locations internally arranged in rows, columns, and banks). Larger implementations combine multiple packages, e.g., eight volatile packages each in an 8 Gbit (1 Gig x 8) configuration, as commonly used with modern processors having at least a 64-bit data bus. If one or more packages present potential contact problems associated with the manufacturing process, then the memory test must test all rows, all columns, all banks, all packages, and all internal dies of a given memory module to expose all possible errors. Since conventional testers access the memory module 64 bits at a time, the error log memory required within a common tester unit must be able to hold (maximum rows) x (maximum columns) x (maximum banks) x (dies) locations, each holding 64 bits. This translates to 1 Gig locations x 64 bits, or 1 Gig x 8 bytes = 8 GB (8 gigabytes), of error log storage.
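The worst-case log size above follows from a short calculation, reproduced here in Python. Taking "1 Gig" to mean 2^30 locations is an assumption; with 10^9 locations the result is 8 x 10^9 bytes instead. The function name is illustrative.

```python
def error_log_bytes(locations: int, bus_width_bits: int) -> int:
    """Bytes needed to log one full-width data word for every location."""
    return locations * bus_width_bits // 8


GIB = 2 ** 30  # assumed meaning of "1 Gig"

if __name__ == "__main__":
    size = error_log_bytes(GIB, 64)
    print(size // GIB, "GiB")  # prints "8 GiB"
```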
The present invention can reduce the size of the error log, and improve the logging of manufacturing-related errors, by several fold. In one aspect, all possible row x column x bank x die locations are tested for a single data bit (the test data bit), and the error data is post-processed for all known manufacturing-related issues (e.g., open, short, or bad contact). The process is then repeated for the next test data bit over all possible address locations until each test data bit has been tested.
Once the data for all rows x all columns x all banks x all dies x each data bit has been collected and post-processed, the final root cause of the error can be placed in the error log. Depending on the type of error, the error log may be as small as a few hundred bytes. For example, a failure of a test data bit across multiple rows, columns, banks, and dies clearly indicates that the test data bit is either open (i.e., the expected test data is never actually written to any of the relevant addresses, and therefore the failure spans all addresses) or shorted to power, ground, or a nearby adjacent signal (i.e., a short to power appears as a bit stuck at logic 1, and a short to ground appears as a bit stuck at logic 0). Thus, by performing a firmware write, a firmware read, and a firmware compare, a test task, program, function, or algorithm will detect these well-known problems and mark the test data bit as faulty. The same error need not be recorded repeatedly in the error log file for each combination of row, column, bank, and die address locations. This greatly reduces the size of manufacturing-related error logs, saving memory space, processing, and power consumption compared to lengthy conventional error logs.
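One way to picture the post-processing step is a function that collapses all observations for one test data bit into a single root-cause entry. The Python sketch below is a simplified heuristic under stated assumptions: every address is tested with both logic 0 and logic 1, and the function name, the cause labels, and the record layout are hypothetical rather than taken from the disclosure.

```python
def classify_bit(observations):
    """Collapse per-address results for one data bit into one log entry.

    observations: iterable of (address, expected_bit, observed_bit).
    Returns None when the bit never failed.
    """
    observations = list(observations)
    fails = [(a, e, o) for a, e, o in observations if e != o]
    if not fails:
        return None

    zeros_expected = sum(1 for _, e, _ in observations if e == 0)
    ones_expected = sum(1 for _, e, _ in observations if e == 1)
    failed_zero = sum(1 for _, e, _ in fails if e == 0)  # read 1, expected 0
    failed_one = sum(1 for _, e, _ in fails if e == 1)   # read 0, expected 1

    if failed_zero == zeros_expected and failed_one == 0:
        cause = "stuck-at-1 (short to power)"
    elif failed_one == ones_expected and failed_zero == 0:
        cause = "stuck-at-0 (short to ground)"
    elif failed_zero and failed_one:
        cause = "open or short to adjacent signal"
    else:
        cause = "address-dependent fault"

    # Only one exemplary address is kept instead of every failing address.
    return {"cause": cause, "example_address": fails[0][0],
            "fail_count": len(fails)}


if __name__ == "__main__":
    stuck_high = [(a, e, 1) for a in range(4) for e in (0, 1)]
    print(classify_bit(stuck_high))
```

A single dictionary per faulty bit replaces what would otherwise be one log record per failing address.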
In general, these functions will help the user reduce operating costs, reduce human error, improve yield, reduce system error, and improve system performance.
Detailed Description
Fig. 1 is a schematic diagram of a prior art volatile memory device tester 100, which generally includes a volatile memory tester unit 101 connected to a Device Under Test (DUT) 103 by a connection 102. The DUT 103 includes one or more sockets 104#0-104#n configured to communicate with a volatile memory device (not shown). The volatile memory device tester 100 allows an end user to perform standard off-the-shelf parametric tests on one or more volatile memory devices inserted into the sockets 104#0-104#n, respectively.
Fig. 2 is a schematic diagram of a prior art non-volatile memory device tester 200, which generally includes a non-volatile memory tester unit 201 connected to a Device Under Test (DUT) 203 by a connection 202. The DUT 203 includes one or more sockets 204#0-204#n configured to communicate with a non-volatile memory device (not shown). The non-volatile memory device tester 200 allows an end user to perform standard off-the-shelf parametric tests on one or more non-volatile memory devices inserted into the sockets 204#0-204#n, respectively.
Fig. 3 is a schematic diagram of a prior art volatile memory DIMM tester 300, which generally includes a volatile memory tester unit 301 connected to a Device Under Test (DUT) 303 by a connection 302. The DUT 303 includes one or more DIMM sockets 304#0-304#n configured to communicate with a volatile memory DIMM (not shown). The volatile memory DIMM tester 300 allows an end user to perform standard off-the-shelf parametric tests on one or more volatile memory DIMMs inserted into the DIMM sockets 304#0-304#n, respectively.
Fig. 4 is a schematic diagram of a prior art non-volatile memory module tester 400, which generally includes a non-volatile memory tester unit 401 connected to a Device Under Test (DUT) 403 by a connection 402. The DUT 403 includes one or more module sockets 404#0-404#n configured to communicate with non-volatile memory modules (not shown). The non-volatile memory module tester 400 allows an end user to perform standard off-the-shelf parametric tests on one or more non-volatile memory modules inserted into the module sockets 404#0-404#n, respectively.
Fig. 5 is a schematic diagram of one embodiment of a tester architecture of an apparatus 500 in accordance with the inventive concepts. The apparatus 500 is capable of testing both volatile and non-volatile memory device units, DIMMs, and module units. The apparatus 500 generally includes test profiles 501 of volatile and non-volatile memory test patterns, a network 503, an MCU 505, a PMU 509, a CMU 510, a TMU 511, SCUs 516, 521, and a plurality of DUTs 519, 520, 522, 523.
The test profile 501 communicates with the network 503 via a communication line 502, and the network 503 communicates with the MCU 505 via a communication line 504. The communication line 502 may be wired or wireless. The network may be any suitable network including, for example, an intranet or an extranet.
The MCU 505 is configured to send a copy of any test profile 501 to one or more SCUs 516, 521 via communication line 515. The MCU 505 is coupled to the PMU 509, CMU 510, and TMU 511 via communication lines 506, 507, and 508, respectively, which form part of a communication bus (not shown).
The MCU 505 is configured to program and change each of the PMU 509, CMU 510, and TMU 511 individually, either serially or in parallel, as required by the test profile 501. The PMU 509 is connected to one or more SCUs 516, 521 via a communication line 512. The CMU 510 is connected to one or more SCUs 516, 521 via a communication line 513. The TMU 511 is connected to one or more SCUs 516, 521 via a communication line 514. This architecture allows the PMU, CMU, and TMU units each to program and control one or more SCUs 516, 521 in parallel.
The SCU 516 is connected in parallel to the DUT 519 and the DUT 520 through a communication bus 518. DUT 519 represents the first of n1 DUTs and DUT 520 represents the last of n1 DUTs controlled by SCU 516.
SCU 521 is connected in parallel to DUT 522 and DUT 523, where SCU 521 is connected to one or more DUTs via communication bus 521. DUT 522 represents the first of n2 DUTs and DUT 523 represents the last of n2 DUTs controlled by SCU 521. The architecture of the present invention should be interpreted as promoting a high degree of parallelism, since there are many dedicated internal buses between the connected units.
Communication lines 515 and 517 provide additional communications as shown.
In a simple example of testing volatile memory, the MCU 505 applies only one of the test profiles 501 to one or more SCUs 516, 521 over the network 503 and performs all required tests using nominal PMU, CMU, and TMU settings, where the SCUs 516 and 521 apply the required AC and DC level signals and test patterns to all DUTs 519, 520, 522, 523, and each of the DUTs 519, 520, 522, 523 performs the required tasks. The test results, with a single memory address location for each memory die portion, as well as a data error log, may be recorded in the corresponding SCU, and the aggregated log may be provided to the MCU 505 and stored on the network 503 for further analysis. In this example, the MCU 505 may program and test all sockets to test one type of volatile memory at a nominal level. In another example, the same program may program and test the same volatile memory sockets, but with a voltage change using PMU 509, a clock frequency change using CMU 510, a temperature change using TMU 511, or a combination of one or more of these variations, to test the same volatile memory sockets using multi-corner testing.
In another embodiment, for testing non-volatile memory, the MCU 505 applies only one of the test profiles 501 to one or more SCUs 516, 521 over the network 503 and performs all required tests using nominal PMU, CMU, and TMU settings, where the SCUs 516 and 521 apply the required AC and DC level signals and test patterns to all DUTs 519, 520, 522, 523, and each of the DUTs 519, 520, 522, 523 performs the required tasks. The test results, with a single memory address location for each memory die portion, as well as a data error log, may be recorded in the corresponding SCU, and the aggregated log may be provided to the MCU 505 and stored on the network 503 for further analysis. In this example, the MCU 505 may program and test all sockets to test one type of non-volatile memory at a nominal level. In another example, the same program may program and test the same non-volatile memory sockets, but with a voltage change using PMU 509, a clock frequency change using CMU 510, a temperature change using TMU 511, or a combination of one or more of these variations, to test the same non-volatile memory sockets using multi-corner testing.
In another embodiment, for testing both volatile and non-volatile memory, the MCU 505 applies only one of the test profiles 501 to one or more SCUs 516, 521 over the network 503 and performs all required tests using nominal PMU, CMU, and TMU settings, where the SCUs 516 and 521 apply the required AC and DC level signals and test patterns to all DUTs 519, 520, 522, 523, and each of the DUTs 519, 520, 522, 523 performs the required tasks. The test results, with a single memory address location for each memory die portion, as well as a data error log, may be recorded in the corresponding SCU, and the aggregated log may be provided to the MCU 505 and stored on the network 503 for further analysis. In this example, the MCU 505 may program and test some sockets to test one type of volatile memory and other sockets to test one type of non-volatile memory at a nominal level. In another example, the same program may be applied to the same volatile and non-volatile memory sockets, but with a voltage change using the PMU, a clock frequency change using the CMU, a temperature change using the TMU, or a combination of one or more of these variations, to test the same volatile and non-volatile memory sockets using multi-corner testing.
One or more SCUs are intelligent. In some embodiments, for example, the SCU is configured to run a functional test mode while the DUT is running a host application. In some embodiments, the SCU can even run some or all host applications using the memory of the DUT.
Fig. 6 is a schematic diagram of steps in a method contemplated by the present invention: step 601, converting memory addresses of volatile and/or non-volatile memory into a matrix; step 602, accumulating error data by traversing a plurality of cells of the matrix to test a bit of the memory; step 603, post-processing the accumulated error data to determine whether the test bit is faulty; step 604, repeating steps 602 and 603 for other test bits; step 605, evaluating the post-processed error data to identify one or more of the test bits as open or shorted; and step 606, including in a test log an exemplary instance (preferably only one) of the memory addresses corresponding to each of the faulty test data bits.
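The flow of steps 601 to 606 can be walked through against a toy model. In the Python sketch below, `SimulatedMemory` is a hypothetical stand-in for a device under test with optional stuck data bits, and `run_bit_tests` is an illustrative name; the classification rule is a simplification that recognizes only bits stuck at one value on every cell.

```python
class SimulatedMemory:
    """Toy memory with optional stuck data bits (a stand-in for a real DUT)."""

    def __init__(self, rows, cols, stuck=None):
        self.rows, self.cols = rows, cols
        self.cells = [[0] * cols for _ in range(rows)]
        self.stuck = stuck or {}  # {bit_index: forced_value}

    def write(self, row, col, value):
        self.cells[row][col] = value

    def read(self, row, col):
        value = self.cells[row][col]
        for bit, forced in self.stuck.items():
            value = value | (1 << bit) if forced else value & ~(1 << bit)
        return value


def run_bit_tests(mem, width):
    log = []
    # Step 601: view the address space as a row x column matrix.
    matrix = [(r, c) for r in range(mem.rows) for c in range(mem.cols)]
    for bit in range(width):  # step 604: repeat for every test bit
        errors = []
        for expected in (0, 1):  # step 602: traverse all cells for this bit
            for r, c in matrix:
                mem.write(r, c, expected << bit)
                observed = (mem.read(r, c) >> bit) & 1
                if observed != expected:
                    errors.append(((r, c), observed))
        if not errors:  # step 603: the bit is faulty only if something failed
            continue
        # Step 605: one constant wrong value on every cell means stuck or shorted.
        values = {observed for _, observed in errors}
        if len(values) == 1 and len(errors) == len(matrix):
            cause = f"stuck-at-{values.pop()}"
        else:
            cause = "unclassified"
        # Step 606: log a single exemplary address for the faulty bit.
        log.append({"bit": bit, "cause": cause, "example": errors[0][0]})
    return log


if __name__ == "__main__":
    memory = SimulatedMemory(4, 4, stuck={2: 1, 5: 0})
    for entry in run_bit_tests(memory, width=8):
        print(entry)
```

For the 4 x 4 example with two stuck bits, the log holds two entries rather than one record per failing cell.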
While certain preferred embodiments and examples have been discussed above, it will be understood that the inventive subject matter extends beyond the specifically disclosed embodiments to other alternative embodiments and/or uses of the invention and obvious modifications and equivalents thereof. The scope of the invention disclosed herein should not be limited by the particular disclosed embodiments. Thus, for example, in any method or process disclosed herein, the acts or operations that make up the method/process may be performed in any suitable order and are not necessarily limited to any particular disclosed order.
Priority
The present application is a continuation-in-part of pending U.S. patent application Ser. No. 17/985037, filed on 11/10/2022.