A minimal voice laboratory

Getting started inexpensively

For a minimal voice laboratory, I would have a flexible fiberoptic endoscope, camera, stroboscopic light source and a recording device (preferably digital).

I would give up a rigid endoscope and even a stroboscope before giving up a flexible endoscope and a video recorder in the voice lab. Without a recording, viewing the larynx with only the naked eye gives up too much in the way of documentation and too much ability to review what happens so quickly in laryngology. With this simple lab, the very astute observer could even elucidate the pathologic process on video recording without a stroboscope, though it would take a great deal of effort and intuition.

Flexible fiberoptic endoscopy

There are many flexible fiberoptic endoscopes that are relatively affordable by medical standards. Their flexibility allows a varied functional view of the larynx and a varied perspective (currently very under-utilized by most laryngologists), but detail is lost because of the fibers transmitting the image. With the image in sharp focus, pixelation by the fiberoptics distracts from the image (imagine a view through an insect's eye). The three alternatives for a soft focus are 1) de-focusing the camera, 2) de-focusing the endoscope, and 3) electronic image smoothing algorithms within the camera processor. I find no difference in the result among these three ways of softening the focus. Nonetheless, even with a less than clear image, the key for the laryngologist is to move the endoscope beyond the nasopharynx, beyond the oropharynx, essentially into the larynx -- getting close to the vocal cords. Closeness improves the image. After topical anesthesia, the endoscope can be placed into all the nooks and crevices of the larynx, touching the larynx if necessary without provoking a gag response.

Closeness improves the image in three ways. First, any pathology fills more of the camera's field of view, and since the lens is a very wide angle lens, moving even slightly closer greatly increases the image size on the screen. The more pixels filled with pathology, the clearer the pathology. Second, the closer the camera is to the structure, the more light falls on the structure and the brighter the image. Third, most processors have an auto-gain feature so that the image always appears as bright as possible. That means that when the endoscope is far away, the pathology appears normally lit rather than dark. This appearance is due to increased gain, or electronic amplification. The trade-off for the brightness is more electronic noise. In the endoscopic image, this noise appears as many additional red dots filling in the image. The higher the gain, the poorer the precision of the image and the more “red” the image appears.

The examiner with a flexible endoscope who moves the scope closer to the vocal cords overcomes the three deficits to a great degree. The image is clearer, brighter and has less red artifact.
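
To put rough numbers on the first two effects, here is a minimal sketch in Python. It rests on simplifying assumptions that are mine and not measured from any endoscope: the lens behaves like an ideal pinhole (apparent size proportional to 1/distance) and the light at the tip behaves like a point source (illumination proportional to 1/distance squared). The function name and the distances are hypothetical.

```python
# Idealized model only (assumptions, not calibrated to any real endoscope):
#   apparent linear size on screen ~ 1 / distance    (pinhole-style lens)
#   pixels covered by the lesion   ~ 1 / distance^2
#   light reaching the tissue      ~ 1 / distance^2  (point-like light at the tip)

def relative_gains(far_mm: float, near_mm: float) -> dict:
    """Compare a near view with a far view under the idealized model above."""
    if far_mm <= 0 or near_mm <= 0:
        raise ValueError("distances must be positive")
    ratio = far_mm / near_mm
    return {
        "linear_size": ratio,        # lesion looks this many times wider
        "pixel_area": ratio ** 2,    # lesion covers this many times more pixels
        "illumination": ratio ** 2,  # tissue receives this many times more light
    }


if __name__ == "__main__":
    # Hypothetical distances: tip moved from 40 mm to 10 mm above the vocal cords.
    for name, factor in relative_gains(far_mm=40.0, near_mm=10.0).items():
        print(f"{name}: {factor:.1f}x")
```

Under these assumptions, the extra light at close range is also what lets the auto-gain back off, which is where the reduction in red noise comes from.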


The typical camera is a C-mount camera that can be attached to either the flexible fiberoptic endoscope or the rigid endoscope.

Recording device

Recording directly to a hard drive is time efficient, provided backups are made in a timely fashion. Digital hard drive storage certainly saves time when comparing new images to old. (I am becoming history: in the past I had two digital tape recorders [Sony DVCAM] and sometimes spent more than a minute cueing up a new and an old exam on the two recorders and flipping the monitor back and forth for comparisons.)

In 2019, the decision is largely between a proprietary recording device and a proprietary recording system. Medical device companies make stand-alone units in which an exam is recorded to an internal hard drive. The exam may be offloaded to a USB drive for preservation. Typically these units make it easy to record exams but more difficult to review them.

A proprietary recording system often includes software to record the examination and methods for archiving the files. Then files may be searched and replayed for review or comparison. 

However, in my own voice lab I have chosen a different route. I find standard video software (e.g., Apple's QuickTime™, Apple's Final Cut Pro™, Adobe's Premiere™) to be much more efficient for reviewing and manipulating video files. It is also far less expensive than medical capture devices or systems. I plug the video outputs from the endoscopes into my laptop and use software to capture the video. The initial setup requires more of my time; the daily use requires much less.


To learn much from laryngology, the examiner needs a way to retrieve old images. I maintain a spreadsheet with a number of parameters recorded for each exam. The essentials are name, date of birth, date of exam, start and stop times for the video recording, and some type of categorization and subcategorization of images. A searchable database of this type allows the examiner to compare a patient’s earlier exam to a current exam. It is also invaluable, when encountering a new laryngeal finding, for looking up previous exams for comparison or to answer a question.
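
For readers who prefer a script to a spreadsheet, the same index can be sketched in Python over a CSV export. This is only an illustration: the column names, the helper functions and the sample rows (fictitious patients and made-up category labels) are my assumptions, not a standard schema.

```python
import csv
import io
from dataclasses import dataclass
from datetime import date


@dataclass
class Exam:
    name: str
    date_of_birth: date
    exam_date: date
    start: str        # start time of the recording, kept as text
    stop: str         # stop time of the recording, kept as text
    category: str
    subcategory: str


def load_exams(fileobj) -> list[Exam]:
    """Read exam rows from a CSV file object with the (hypothetical) columns below."""
    return [
        Exam(
            name=row["name"],
            date_of_birth=date.fromisoformat(row["date_of_birth"]),
            exam_date=date.fromisoformat(row["exam_date"]),
            start=row["start"],
            stop=row["stop"],
            category=row["category"],
            subcategory=row["subcategory"],
        )
        for row in csv.DictReader(fileobj)
    ]


def exams_for_patient(exams, name, date_of_birth):
    """All exams for one patient, oldest first, for earlier-versus-current comparison."""
    matches = [e for e in exams if e.name == name and e.date_of_birth == date_of_birth]
    return sorted(matches, key=lambda e: e.exam_date)


def exams_in_category(exams, category, subcategory=None):
    """Exams from any patient in a category, optionally narrowed by subcategory."""
    return [
        e for e in exams
        if e.category.lower() == category.lower()
        and (subcategory is None or e.subcategory.lower() == subcategory.lower())
    ]


# Fictitious sample data; the category labels are placeholders.
SAMPLE_CSV = """name,date_of_birth,exam_date,start,stop,category,subcategory
Jane Doe,1970-01-02,2019-04-09,00:01:10,00:04:35,mucosal lesion,polyp
Jane Doe,1970-01-02,2018-03-05,00:12:00,00:15:20,mucosal lesion,polyp
John Roe,1955-06-30,2019-02-11,00:00:45,00:03:05,paresis,unilateral
"""

if __name__ == "__main__":
    exams = load_exams(io.StringIO(SAMPLE_CSV))
    for exam in exams_for_patient(exams, "Jane Doe", date(1970, 1, 2)):
        print(exam.exam_date, exam.start, exam.stop, exam.category)
```

Matching on name plus date of birth is a simple way to keep two patients with the same name apart; a real index would more likely use a record number.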

Light Source

Xenon seems to be the premium light source and halogen is the second brightest. A replacement halogen bulb is in the price range of USD 5.00 and a replacement xenon bulb is in the range of USD 900.00. There is some LED-type equipment, although I have not found it to be bright enough (yet). In older systems like my KayPentax stroboscope there are two types of lights, and they have very different color balances. It is then helpful to have a camera that can switch at the touch of a button between two white balance settings. Even with white balance, there will be color differences between images taken by different examiners with different set-ups. It is very difficult to make color comparisons between images obtained with different endoscopes, with different light sources and from different laboratories.

Bright is good as long as there is not so much heat that cables melt or people are burned when touching the light cable.


After the flexible endoscope and a recording device, the next most valuable item is a stroboscope. A stroboscope measures the pitch of the voice and then flashes a light at a slightly lower or higher rate, so that the recorded image creates the illusion of slow motion. Assuming that the vocal cords are vibrating regularly, the recorded image represents the actual motion of the vocal cords, which otherwise cannot be perceived. Typical NTSC video records at about 30 images per second, and a male speaking voice vibrates the vocal cords about 100 times per second. Thus, with a steady light source, the edges of the vocal folds move through about three cycles in every video frame and consequently appear blurred. Different companies use various techniques for shuttering the light source or shuttering the camera to create a stroboscopic image.
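
The arithmetic behind the illusion can be sketched in a few lines of Python. This is a simplified model that assumes perfectly regular vibration and one flash per cycle, slightly detuned from the pitch; it does not describe any particular manufacturer's shuttering technique. The function names and the 98.5 Hz flash rate are hypothetical.

```python
def cycles_per_frame(pitch_hz: float, frame_rate_hz: float) -> float:
    """Vibratory cycles that elapse during one video frame under steady light."""
    return pitch_hz / frame_rate_hz


def apparent_frequency(pitch_hz: float, flash_rate_hz: float) -> float:
    """Rate of the slow-motion illusion: the difference between pitch and flash rate.

    Valid only when the flash rate is close to the pitch (about one flash per cycle).
    """
    return abs(pitch_hz - flash_rate_hz)


def slow_motion_factor(pitch_hz: float, flash_rate_hz: float) -> float:
    """How many times slower the vibration appears than it really is."""
    beat = apparent_frequency(pitch_hz, flash_rate_hz)
    if beat == 0:
        return float("inf")  # flash exactly at pitch: the image appears frozen
    return pitch_hz / beat


if __name__ == "__main__":
    pitch = 100.0       # male speaking voice, about 100 cycles per second
    frame_rate = 30.0   # NTSC, about 30 images per second
    flash = 98.5        # hypothetical flash rate, slightly below the pitch

    print(f"cycles per frame, steady light: {cycles_per_frame(pitch, frame_rate):.2f}")
    print(f"apparent vibration rate: {apparent_frequency(pitch, flash):.1f} per second")
    print(f"slow-motion factor: {slow_motion_factor(pitch, flash):.1f}x")
```

Each flash catches the vocal cords a little later in the cycle than the previous one, so the sequence of flashes walks through one apparent cycle at the difference frequency. If the vibration is irregular, the assumption fails and the slow motion no longer represents the true movement.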

Rigid endoscopy

For a rigid endoscope, I prefer the 70 degree glass rod to the 90 degree, for the ease of obtaining an image and for the comfort of the patient in positioning their head during the exam. The 70 degree endoscope can move closer to the larynx than the 90. Closeness improves resolution. With the mouth open, however, phonation is altered and limited, because the position of the tongue becomes fixed in the examiner’s hands.

While a flexible endoscope more easily provides a close view, camera chips are larger on rigid endoscopes, so images have higher resolution at a lower cost than with flexible chip-on-tip endoscopes.

In an exam room with funds sufficient only for a flexible fiberoptic endoscope and a rigid endoscope, the rigid endoscope offers a clarity of view that is complementary to the flexible fiberoptic endoscope’s close and functional view. The rigid endoscope is poor at assessing much function other than changes in the medial margin of the vocal cords because it is limited to a single perspective - nearly directly above the larynx. There is no other position in which to place the scope. The rigid endoscope gives a fairly detailed view of the microvasculature as well. The main functional alteration to be made while viewing is to observe how the vibrations of the medial margin of the vocal cords change with changes in pitch. The images are beautiful for presentations.