Structure of the MEMS Micro Speaker

Who needs a membrane?

How we produce sound with tiny air chambers inside a 10 mm² MEMS chip

Did you know that the sound in your earphone is generated in basically the same way as in your home sound system? If you work in our field or are generally tech-savvy, this is probably not news. But it’s still remarkable.

The market for microspeakers hasn’t changed much in decades. It is still dominated by electrodynamic drivers, even though they are bulky and power-hungry. But they are also inexpensive, thanks to highly automated mass production. Any technology intended to replace them therefore needs not only to meet quality criteria but also to be very cost-efficient.

And better solutions are clearly needed. Earbuds are turning into smart “hearables”, with an increasing number of AI functions and thus sensors and edge computing. That requires space and energy. Microspeakers need to become smaller and more energy-efficient to free up space and power, and to enable the integration of further electronic components such as amplifiers and a wireless connection.

Silicon is the answer for modern hearables

The good thing is, there is a better solution: silicon. Or, more precisely, MEMS (microelectromechanical systems). High-quality, efficient microspeakers can be produced in a miniature format using widespread MEMS technology. However, there are a few factors to be aware of: the chances of commercial success for a MEMS technology increase when a) standard materials and processes are used, b) the technical performance is convincing and c) economies of scale are possible.

To realize economies of scale and thus reach a competitive price per unit, the size of the single chip is essential. The smaller the individual chip, the more chips can be produced from one silicon wafer.

At Arioso, we have invented a MEMS microspeaker that can generate more than 120 decibels with an active chip area as small as 10 mm². Of course, some additional space is required for contact pads, but this can be minimised. A typical production lot of 25 silicon wafers thus makes it possible to produce up to 50,000 microspeaker chips at once!
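To make the link between chip size and chips per lot concrete, here is a rough plausibility sketch using a common first-order estimate for gross dies per wafer. The 200 mm wafer diameter and the 15 mm² total die area (10 mm² active area plus an allowance for contact pads and saw lanes) are assumptions chosen for illustration, not production figures, and yield losses are ignored.

```python
import math


def gross_dies_per_wafer(wafer_diameter_mm: float, die_area_mm2: float) -> int:
    """First-order estimate of gross dies per wafer.

    The first term is wafer area divided by die area; the second term
    approximates the partial dies lost along the wafer edge.
    """
    radius_mm = wafer_diameter_mm / 2
    area_term = math.pi * radius_mm**2 / die_area_mm2
    edge_term = math.pi * wafer_diameter_mm / math.sqrt(2 * die_area_mm2)
    return int(area_term - edge_term)


# Illustrative assumptions (not taken from the article):
WAFER_DIAMETER_MM = 200.0  # assumed wafer size
DIE_AREA_MM2 = 15.0        # assumed: 10 mm² active area + pads + saw lane

# From the article: one production lot holds 25 wafers.
WAFERS_PER_LOT = 25

dies_per_wafer = gross_dies_per_wafer(WAFER_DIAMETER_MM, DIE_AREA_MM2)
dies_per_lot = dies_per_wafer * WAFERS_PER_LOT

print(f"Gross dies per wafer: {dies_per_wafer}")
print(f"Gross dies per lot:   {dies_per_lot}")
```

Under these assumptions the estimate lands at just under 2,000 chips per wafer, which is in line with the figure of up to 50,000 chips per lot. Halving the die area roughly doubles the count, which is why chip size matters so much for the price per unit.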

We cut the membrane into strips and moved them inside the silicon chip

To get the chip as small as possible, we have done something new: we got rid of the traditional membrane. Simply speaking, we cut it into a multitude of thin strips and put them into the volume of the silicon chip. The result is an array of very thin beams, or cantilevers, enclosing air. The big advantage of this approach is that the chip has a large acoustically active inner surface while its outer surface is kept to an absolute minimum.

We’re often asked to explain how that works. How do you produce sound inside a tiny silicon chip? In short: the beams are moved by electrostatic forces according to an audio signal. Electrostatic micro motors are used for that. They are very popular in MEMS devices such as microphones, because only two conductive materials (electrodes) and an insulator between them are needed. Hence, the common materials and techniques of microelectronics can be used, which makes the technology cost-efficient.

Coming back to our silicon chip: the movement of the beams displaces the air between them. This movement is restricted to only a few micrometers; the diameter of a human hair is roughly 75 times larger. That means each individual beam can displace only a small amount of air. The required sound pressure level is reached through the sheer number of beams in one chip. The air is released through openings in the chip’s bottom and lid. And that’s it: we have sound.

The bottom and lid wafers are bonded to the device wafer. The sound-producing beams move with a clearance of only about one micrometer to the lid above them and to the bottom below them. This gap is too narrow to let air pass (at these small dimensions, air has the fluidic characteristics of honey). The air is therefore released only through the intended openings in the chip surface.
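To get a feeling for how much air all beams together have to move, here is a minimal sketch based on the textbook pressure-chamber approximation: at low frequencies, a volume change ΔV in a sealed cavity of volume V0 produces a pressure change of about γ·P0·ΔV/V0. The sealed cavity volume of 1.3 cm³ (a rough stand-in for an occluded ear canal) and the beam counts in the loop are assumptions for illustration only, not specifications of the chip.

```python
import math

P_REF = 20e-6      # Pa, reference sound pressure for dB SPL
P_ATM = 101_325.0  # Pa, standard atmospheric pressure
GAMMA = 1.4        # adiabatic index of air


def required_volume_displacement(spl_db: float, cavity_volume_m3: float) -> float:
    """RMS volume displacement (m³) needed for a given SPL in a sealed cavity."""
    p_rms = P_REF * 10 ** (spl_db / 20)
    return p_rms * cavity_volume_m3 / (GAMMA * P_ATM)


def spl_from_displacement(delta_v_m3: float, cavity_volume_m3: float) -> float:
    """SPL (dB) produced by an RMS volume displacement in a sealed cavity."""
    p_rms = GAMMA * P_ATM * delta_v_m3 / cavity_volume_m3
    return 20 * math.log10(p_rms / P_REF)


# Assumption for illustration: sealed cavity of about 1.3 cm³.
CAVITY_VOLUME_M3 = 1.3e-6

total_dv = required_volume_displacement(120.0, CAVITY_VOLUME_M3)
print(f"Total displacement for 120 dB SPL: {total_dv * 1e9:.3f} mm³ (RMS)")

# Hypothetical beam counts, only to show how the load is shared.
for n_beams in (100, 1_000, 10_000):
    per_beam_um3 = total_dv / n_beams * 1e18  # m³ -> µm³
    print(f"{n_beams:>6} beams -> {per_beam_um3:,.0f} µm³ per beam")
```

The total comes out at a fraction of a cubic millimeter. Spread over many beams, each one only has to move a tiny volume of air, which is the idea behind reaching the sound pressure level through the number of beams.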

Customized acoustic design via semiconductor mask layout

The specifications of the microspeaker are determined by the number and design of the beams and of the air chambers between them. Amplitude-frequency response, resonance frequency, harmonic distortion: all of these can be influenced through the mask layout for the photolithographic production process. The technology thus makes it possible to meet individual market needs for a large variety of in-ear devices, from hearables to smart hearing aids and in-ear monitors.

Energy efficiency needed by modern in-ear devices

Our patented technology not only fulfils the demand for very small dimensions. It also saves battery power, which is urgently needed in modern hearables and hearing aids. The small electrode gaps of less than 2.5 µm allow for drive-voltage amplitudes well below 15 V and an electrical capacitance significantly below 1 nF.
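The following sketch shows what these numbers mean in an idealised parallel-plate model with an air gap. This model is an assumption made for illustration; the real electrode geometry is more complex, and the values used are the upper bounds quoted above, so the results are rough bounds rather than device data.

```python
EPS0 = 8.8541878128e-12  # F/m, vacuum permittivity


def plate_area_from_capacitance(capacitance_f: float, gap_m: float) -> float:
    """Electrode area (m²) of an ideal parallel-plate capacitor: A = C * g / eps0."""
    return capacitance_f * gap_m / EPS0


def electrostatic_pressure(voltage_v: float, gap_m: float) -> float:
    """Attractive force per unit electrode area (Pa): p = eps0 * (V / g)² / 2."""
    return 0.5 * EPS0 * (voltage_v / gap_m) ** 2


# Upper bounds from the text, used here as illustrative inputs.
GAP_M = 2.5e-6        # electrode gap
VOLTAGE_V = 15.0      # drive-voltage amplitude
CAPACITANCE_F = 1e-9  # electrical capacitance

area_m2 = plate_area_from_capacitance(CAPACITANCE_F, GAP_M)
pressure_pa = electrostatic_pressure(VOLTAGE_V, GAP_M)

print(f"Equivalent electrode area: {area_m2 * 1e6:.0f} mm²")
print(f"Electrostatic pressure:    {pressure_pa:.0f} Pa")
```

In this simplified picture, 1 nF across a 2.5 µm gap corresponds to an electrode area in the range of a few hundred mm², far more than the 10 mm² chip footprint. Because the electrostatic pressure scales with (V/g)², a small gap is what makes low drive voltages workable.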

With our proprietary ASIC amplifier design and power management, we can generate these voltage levels from ordinary audio signals and the battery supply. We target a system power consumption that stays below 3 mW for normal use cases.
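As a back-of-the-envelope check on the load side only, the sketch below estimates the power needed to charge and discharge a capacitive load once per signal period without any charge recovery, using the familiar P ≈ C·V²·f relation. The 1 kHz test tone is an assumption, and the estimate says nothing about the amplifier's own losses or about real audio signals, whose amplitude is usually far below full scale.

```python
def capacitive_drive_power(capacitance_f: float, voltage_v: float, frequency_hz: float) -> float:
    """Power (W) drawn when a capacitor is charged to voltage_v and fully
    discharged once per period, with no energy recovered: P = C * V² * f."""
    return capacitance_f * voltage_v**2 * frequency_hz


def stored_energy(capacitance_f: float, voltage_v: float) -> float:
    """Energy (J) stored in a capacitor at voltage_v: E = C * V² / 2."""
    return 0.5 * capacitance_f * voltage_v**2


# Upper bounds from the text, plus an assumed test-tone frequency.
CAPACITANCE_F = 1e-9     # below 1 nF
VOLTAGE_V = 15.0         # below 15 V
TONE_FREQUENCY_HZ = 1e3  # assumption: 1 kHz full-amplitude tone

energy_j = stored_energy(CAPACITANCE_F, VOLTAGE_V)
power_w = capacitive_drive_power(CAPACITANCE_F, VOLTAGE_V, TONE_FREQUENCY_HZ)

print(f"Stored energy per charge: {energy_j * 1e9:.1f} nJ")
print(f"Drive power at 1 kHz:     {power_w * 1e3:.3f} mW")
```

With these inputs the load itself accounts for a fraction of a milliwatt, which leaves headroom within a 3 mW system budget for the amplifier and power management. Since the estimate scales linearly with frequency and quadratically with voltage, keeping the drive voltage low has the largest effect.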

Since only well-known materials and widespread CMOS production capacities are used, and because the speakers are small, easily scalable and very energy-efficient, they have the potential to gain cost leadership and offer a real alternative to traditional electrodynamic microspeakers. In doing so, they can enable the big vision of a permanently worn, voice-controlled hearable device.
