Video Conferencing Room Design
Video conferencing quality depends far more on room design than on equipment selection alone. A $3,000 camera in a poorly designed room produces worse results than a $300 camera in a well-designed space. For AV integrators, mastering conferencing room design means delivering consistent, professional experiences that end-users trust.
Why Room Design Matters
The best unified communications and collaboration (UCC) platforms and Microsoft Teams Rooms (MTR) systems fail when deployed in spaces with poor sightlines, inadequate lighting, acoustic echo, undersized displays, or network congestion. Conversely, a modest camera and display in a carefully designed space deliver clarity, professionalism, and engagement that keep participants focused on content and conversation, not technical frustrations.
Display Sizing
Display size directly affects remote participant visibility and engagement. Too small and remote faces are indistinct; too large and eye contact becomes awkward. The ANSI/AVIXA DISCAS standard (Display Image Size for 2D Content in Audiovisual Systems) guides selection.
DISCAS Display Size Formula
A simplified viewing-distance calculation gives the minimum display diagonal:
- Minimum screen diagonal (inches) = Viewing distance (feet) × 12 / Viewing distance ratio
- For video conferencing, target a ratio of 8 (the farthest viewer sits no more than 8 screen diagonals from the display)
- Example: In a 12-foot conference room, minimum display = 12 × 12 / 8 = 18 inches diagonal
- Treat this result as a floor for basic visibility; the typical sizes below are considerably larger so that faces and shared content stay comfortable to read
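The arithmetic in the example above can be sketched as a small helper. This is a minimal illustration, assuming the divide-by-8 factor used in the worked example; the function name and the default ratio are hypothetical and not part of any standard's API.

```python
def min_display_diagonal_in(viewing_distance_ft: float, ratio: float = 8.0) -> float:
    """Return the minimum display diagonal in inches.

    viewing_distance_ft: distance from the display to the farthest viewer, in feet.
    ratio: viewing distance divided by screen diagonal (assumed default: 8).
    """
    if viewing_distance_ft <= 0 or ratio <= 0:
        raise ValueError("distance and ratio must be positive")
    # Convert feet to inches, then divide by the viewing distance ratio.
    return viewing_distance_ft * 12.0 / ratio


# The 12-foot room from the worked example.
print(min_display_diagonal_in(12))
```

The result is a lower bound only; room-level recommendations are typically larger.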
Typical sizes by farthest viewing distance:
- Small huddle space (8–12 feet): 32–40 inch display
- Medium conference room (12–16 feet): 50–65 inch display
- Large conference room (16–24 feet): 75–85 inch display
- Boardroom/Training (20–30 feet): 85–110 inch display
Single vs Dual Display Layouts
- Single display: Simpler wiring and control, natural focal point. Works well when content and people can share one screen, either alternating or in a split layout with equal screen real estate.
- Dual display (side-by-side or Picture-by-Picture layouts): Optimal when content and participants both need full visibility. One display shows remote participants at full height (for example, in gallery view) while the other shows shared content.
Display Height for Remote Participant Eye Contact
Mount displays so remote participants' eyes appear at or slightly above the seated eye level of local participants (approximately 48–60 inches from floor to center of display). This creates natural eye contact between local and remote participants. A display mounted too high forces local participants to look up for the whole meeting; mounted too low forces an unnatural downward gaze.
Camera Placement and Field of View
Camera placement and framing set the tone for how participants are perceived by remote attendees.
Horizontal Field of View (FOV) Requirements
Use basic trigonometry to determine the camera FOV from the width of the seating area and the camera's distance to it:
- Target: Horizontal FOV sufficient to frame all local participants with some margin at the edges
- Typical range: 60–75° horizontal FOV for small to medium conference rooms
- Calculation: Field of view angle = 2 × arctan(seating width / (2 × camera-to-participant distance))
- Example: To frame a 12-foot-wide seating area from 8 feet away, required FOV = 2 × arctan(12 / 16) ≈ 74°
Too wide FOV distorts faces (ultra-wide angle); too narrow FOV misses participants. Avoid mounting a narrow-FOV camera (30°) in a room wider than the camera can frame—remote participants see only partial framing or repeated panning.
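The field-of-view relationship can be checked numerically. The sketch below assumes the distance that matters is from the lens to the nearest row of participants and that the seating area is centered on the camera axis; both function names are hypothetical.

```python
import math


def required_hfov_deg(seating_width_ft: float, camera_distance_ft: float) -> float:
    """Horizontal FOV (degrees) needed to frame a seating area of the given width."""
    if seating_width_ft <= 0 or camera_distance_ft <= 0:
        raise ValueError("width and distance must be positive")
    return math.degrees(2.0 * math.atan(seating_width_ft / (2.0 * camera_distance_ft)))


def max_framed_width_ft(hfov_deg: float, camera_distance_ft: float) -> float:
    """Width (feet) a camera with the given horizontal FOV can frame at a distance."""
    if not 0 < hfov_deg < 180 or camera_distance_ft <= 0:
        raise ValueError("FOV must be in (0, 180) degrees and distance positive")
    return 2.0 * camera_distance_ft * math.tan(math.radians(hfov_deg) / 2.0)


# A 12-foot-wide seating area viewed from 8 feet away (assumed distance).
print(round(required_hfov_deg(12, 8), 1))
# How much width a narrow 30-degree camera covers from the same position.
print(round(max_framed_width_ft(30, 8), 1))
```

Running the second function for a 30° camera shows why a narrow lens cannot frame a wide table from a short distance.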
Camera Height and Framing
Mount cameras at eye level (approximately 48–60 inches from floor) so the lens points directly at participants' faces when they are seated. This creates natural eye contact between local and remote participants. The camera should frame participants from mid-chest upward, with slight headroom above.
- Too high: Camera looks down on participants, showing foreheads and the tops of heads; participants appear diminished to remote viewers.
- Too low: Camera looks up at participants' nostrils; unflattering and unprofessional.
- Optimal: Lens height = average seated eye level; camera points straight ahead at participants.
Avoiding Backlit Placement
Never position a camera such that windows or bright light sources are behind participants (backlit setup). The camera's exposure will meter off the bright background, leaving participants' faces dark and featureless. Position cameras and seating so:
- Windows are to the side or in front of participants (providing key light)
- Bright light sources are not directly behind seated participants
- If windows are behind the participants, install window coverings (blackout shades or diffusion) to balance exposure
Integrated vs Discrete Cameras
- Integrated cameras (built into displays or conference room endpoints): Convenient, minimal visible hardware. However, camera position is fixed to the display mounting height; if the display is too high or too low, camera framing suffers.
- Discrete cameras (separate unit, mounted independently): More flexibility for optimal positioning. Can be mounted at eye level even if the display is mounted higher for visibility. Best practice for professional conference rooms.
Microphone Coverage Design
Microphone design determines how clearly local participants' voices are transmitted to remote attendees. Poor microphone design causes remote participants to hear only fragments of local conversation or, worse, ask "Can you repeat that?" repeatedly.
Microphone Types and Coverage Zones
Different microphone types cover different areas:
- Boundary condensers (tabletop): ~8-foot diameter coverage, omnidirectional, sit passively on the table. Good for small boardrooms. See microphone-types for details.
- Ceiling-mounted condenser arrays: ~15–20-foot diameter coverage, cardioid or supercardioid, ceiling-mounted. Ideal for medium to large conference rooms.
- Wireless lavaliers and handhelds: ~2–3-foot intimate coverage, worn (lavalier with body-pack transmitter) or held by the presenter.
- Headset/boom mics: ~6-inch intimate coverage, mouth-mounted.
Table vs Ceiling vs Wall Microphone Placement
- Table placement: Puts the mic at talker mouth level, maximizing direct sound capture relative to room noise. Ideal for boardrooms where participants sit around a table. Requires boundary mics (omnidirectional or cardioid) designed to sit on hard surfaces.
- Ceiling placement: Distributes coverage over a large area with minimal visible hardware. Requires beamforming or array technology to isolate talkers and reject room noise. Common in larger conference rooms.
- Wall placement: Useful when table mounting isn't practical (e.g., auditorium-style seating). Mount arrays at the talkers' head level (or above) for a good angle to the talker's mouth.
Microphone-to-Talker Distance
Active talkers must be within the microphone's coverage zone to be heard clearly by remote participants. A seated participant 12 feet from the nearest microphone will sound distant and weak. Design for:
- Tabletop mics: All participants inside the mic's ~8-foot coverage diameter (about 4 feet from the mic)
- Ceiling arrays: All participants inside the array's 15–20-foot coverage diameter (about 7–10 feet horizontally from the array)
- Handheld/wireless: Presenter holds/wears the mic, ensuring close distance
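A coverage check like the one described above can be sketched in a few lines. This is an illustrative model only: it treats each microphone's pickup zone as a circle on the floor plan, and the coverage radius is an input the designer supplies from the manufacturer's data. The `Mic` type, the `uncovered_seats` function, and the 4-foot radius in the example (assumed as half of an 8-foot coverage diameter) are hypothetical.

```python
import math
from typing import List, NamedTuple, Tuple

Seat = Tuple[float, float]  # (x, y) position on the floor plan, in feet


class Mic(NamedTuple):
    x_ft: float
    y_ft: float
    radius_ft: float  # assumed usable pickup radius for this mic


def uncovered_seats(seats: List[Seat], mics: List[Mic]) -> List[Seat]:
    """Return the seats that fall outside every microphone's coverage circle."""
    return [
        seat
        for seat in seats
        if not any(math.dist(seat, (m.x_ft, m.y_ft)) <= m.radius_ft for m in mics)
    ]


# Three seats along one side of a table, one tabletop mic at the origin.
seats = [(0.0, 0.0), (3.0, 0.0), (6.0, 0.0)]
print(uncovered_seats(seats, [Mic(0.0, 0.0, 4.0)]))
# Adding a second mic near the far seat closes the gap.
print(uncovered_seats(seats, [Mic(0.0, 0.0, 4.0), Mic(6.0, 0.0, 4.0)]))
```

A real design would also account for microphone polar pattern and table geometry; the circle model is only a first pass for spotting seats that are clearly out of range.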
Microphone-to-Loudspeaker Distance for Acoustic Echo Cancellation (AEC)
Modern conferencing platforms use software-based AEC to subtract loudspeaker output from the microphone input, preventing echo. AEC adapts to the acoustic path between loudspeaker and microphone, so it performs best when that path is short and stable. For optimal AEC performance:
- Position microphone and speaker on the same side of the room (not opposite ends)
- Distance between mic and speaker should be consistent and known
- Ideal mic-to-speaker distance: 4–8 feet
- If mic and speaker are >15 feet apart, or if room geometry makes distance variable, AEC may fail and create persistent echo
Speaker Placement in Conference Rooms
Speakers reproduce both local audio (ringtones, notifications) and remote participant voices. Poor speaker placement creates echo, feedback, or unintelligible audio.
Front of Room vs Ceiling Distributed Speakers
- Front-of-room: Single large speaker (or pair) mounted near the display, pointing toward seating. Provides strong direct sound and clear acoustical image (participants' voices appear to come from the display, where remote faces are visible). Requires higher SPL but simplifies delay alignment. See speaker-placement for geometry details.
- Ceiling distributed: Multiple smaller speakers distributed throughout the room. Reduces max SPL per speaker, spreads acoustic loading. Requires careful delay alignment to maintain image coherence; risk of echo if delays are incorrect.
For small to medium conference rooms (up to 200 sq ft), front-of-room is usually optimal. For large rooms (200+ sq ft), a combination of front speakers plus rear-fill or distributed ceiling speakers may be necessary.
Speech Intelligibility Requirements
Remote participant voices must be intelligible to local participants. Measure the Speech Transmission Index (STI) or the Clarity Index (C50) at candidate listening positions. Target:
- STI > 0.6 (acceptable for conference spaces)
- STI > 0.75 (good for professional conferencing)
Poor intelligibility symptoms: "Can you repeat that?" from local participants, inability to distinguish speaker voices, muffled or boomy-sounding audio. Causes: inadequate speaker SPL, excessive room reverberation, poor speaker-to-microphone distance for AEC, or speaker placement that creates comb filtering.
Avoiding Speaker-to-Microphone Feedback Paths
When local speaker output is picked up by the microphone and sent back to the speaker (positive feedback loop), echo and howl occur. Prevent this by:
- Positioning microphone and speaker on the same side of the room
- Using directional microphones that reject sound from speaker direction
- Implementing proper AEC (which requires stable mic-to-speaker distance)
- Using a speaker that doesn't aim directly at the microphone
- Testing for feedback by raising system volume slowly; feedback should not occur at normal operating levels
Acoustic Requirements
Room acoustics determine voice clarity and absence of distracting echo.
RT60 (Reverberation Time)
RT60 is the time required for sound to decay by 60 dB. Lower RT60 means sound decays quickly; higher RT60 means longer echo and reverberation.
Target RT60 for Video Conferencing: 0.3–0.5 seconds
- Below 0.3 seconds: Room sounds dead and overly dry; speech may sound crisp but unnatural.
- 0.3–0.5 seconds: Optimal for speech intelligibility; room sounds lively but controlled.
- Above 0.5 seconds: Echo and reverberation degrade remote participant intelligibility; AEC may fail.
How to Measure RT60
Use an RTA (real-time analyzer) app with an RT60 function and a calibrated microphone. Play pink noise until the level is steady, then cut the stimulus and measure the decay time; alternatively, derive the decay from a swept-sine impulse response measurement. Measure at multiple frequencies (250 Hz, 500 Hz, 1 kHz, 2 kHz) because RT60 varies with frequency.
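Most rooms do not offer a full 60 dB of decay above the noise floor, so a common approach is to fit the slope of the decay over a smaller range and extrapolate to 60 dB. The sketch below assumes the usual T20 convention (fit from 5 dB to 25 dB below the steady level) and works on a list of level readings taken after the noise is cut. The function name, its parameters, and the synthetic data are illustrative, not taken from any particular measurement tool.

```python
def rt60_from_decay(times_s, levels_db, start_db=-5.0, stop_db=-25.0):
    """Estimate RT60 (seconds) from a measured level decay.

    times_s: sample times in seconds, starting when the stimulus is cut.
    levels_db: sound level at each time; the first sample is the steady level.
    start_db / stop_db: fit range relative to the steady level (T20 by default).
    """
    if not levels_db or len(times_s) != len(levels_db):
        raise ValueError("times_s and levels_db must be non-empty and equal length")
    reference = levels_db[0]
    # Keep only samples inside the fit range, expressed relative to the steady level.
    points = [
        (t, level - reference)
        for t, level in zip(times_s, levels_db)
        if stop_db <= level - reference <= start_db
    ]
    if len(points) < 2:
        raise ValueError("need at least two samples inside the fit range")
    n = len(points)
    mean_t = sum(t for t, _ in points) / n
    mean_l = sum(l for _, l in points) / n
    sxx = sum((t - mean_t) ** 2 for t, _ in points)
    sxy = sum((t - mean_t) * (l - mean_l) for t, l in points)
    if sxx == 0:
        raise ValueError("samples in the fit range share the same time stamp")
    slope_db_per_s = sxy / sxx  # least-squares decay slope
    if slope_db_per_s >= 0:
        raise ValueError("level is not decaying in the fit range")
    # Extrapolate the fitted slope to a full 60 dB decay.
    return -60.0 / slope_db_per_s


# Synthetic decay of 150 dB per second, which corresponds to about 0.4 s.
times = [i * 0.01 for i in range(60)]
levels = [90.0 - 150.0 * t for t in times]
print(round(rt60_from_decay(times, levels), 3))
```

Repeating the estimate per octave band gives the frequency-dependent picture described above.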
Noise Floor Requirements
Remote participants should not hear background noise (HVAC, traffic, adjacent offices) in the conference room. Target noise floor:
- NC-30 or better (roughly 35–40 dBA)
ASHRAE HVAC Noise Specification
HVAC systems are major background noise sources. Specify HVAC ductwork and equipment meeting ASHRAE NC-30 or better:
- Low-velocity ductwork (≤1500 fpm)
- Lined ducts to absorb noise
- Vibration isolation for fans and compressors
- No exposed dampers or control valves that create hiss
Background Noise Sources to Control
- Keyboard/mouse noise: Provide soft-touch peripherals or enforce keyboard discipline
- Paper shuffling: Warn participants before meetings or minimize paper use
- Mobile notifications: Silence phones before meetings
- Adjacent spaces: Use sound-absorbing partitions or schedule meetings away from noisy adjacent areas
Network Requirements
Video conferencing demands consistent, high-quality network connectivity. Inadequate network design results in pixelated video, frozen frames, one-way audio, or dropped calls.
Bandwidth Per Platform
Typical per-participant bandwidth requirements (downstream/upstream):
- Zoom 1080p (30 fps): 3.0/2.5 Mbps per participant
- Microsoft Teams HD (30 fps): 2.5/1.5 Mbps per participant
- Cisco Webex HD (30 fps): 2.5/1.5 Mbps per participant
- Multiply by participant count: a 10-person meeting with everyone on video needs 10 × 2.5 = 25 Mbps downstream
Margin and Peak Load
Allocate headroom above sustained bandwidth requirements. Conference rooms often have bursts (screen share, gallery view switching). Provision 150% of calculated bandwidth to handle peaks and interference from other network users.
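The sizing rule above (per-participant rate × participant count × 150%) can be sketched as follows. The rates are copied from the list in this section; the dictionary keys and the function name are hypothetical labels chosen for the example.

```python
# Per-participant (downstream, upstream) rates in Mbps, as listed in this section.
PLATFORM_MBPS = {
    "zoom_1080p": (3.0, 2.5),
    "teams_hd": (2.5, 1.5),
    "webex_hd": (2.5, 1.5),
}


def provisioned_bandwidth_mbps(platform: str, participants: int, headroom: float = 1.5):
    """Return (downstream, upstream) Mbps to provision, including headroom."""
    if participants < 1:
        raise ValueError("participants must be at least 1")
    if headroom < 1.0:
        raise ValueError("headroom must be 1.0 or greater")
    down, up = PLATFORM_MBPS[platform]
    # Sustained load scales with participant count; headroom covers bursts.
    return participants * down * headroom, participants * up * headroom


# A 10-person meeting with everyone on video.
print(provisioned_bandwidth_mbps("teams_hd", 10))
```

Passing a different `headroom` value makes it easy to compare the 150% rule against a more conservative margin.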
QoS (Quality of Service) for Video Conferencing
Configure DSCP (Differentiated Services Code Point) tags and QoS policies to prioritize video conferencing traffic. The hex values below are the full ToS/Traffic Class byte, which is the 6-bit DSCP value shifted left by two bits:
- AF41 (video): DSCP 34, ToS byte 0x88; prioritize video frames
- AF31 (audio): DSCP 26, ToS byte 0x68; prioritize audio (slightly lower than video)
- CS5 (real-time interactive): DSCP 40, ToS byte 0xA0; highest priority for interactive applications
- CS0 (best effort): DSCP 0, ToS byte 0x00; default for other traffic
Most modern conferencing endpoints and firewalls support DSCP tagging. Configure your QoS policy to honor these tags.
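The relationship between a DSCP class and the byte that appears in the IP header can be verified with a short sketch. DSCP occupies the upper six bits of the ToS/Traffic Class byte, so the byte is the DSCP value shifted left by two; CS5 is DSCP 40, which shifts to 0xA0. The `mark_socket` helper is a hypothetical illustration using the standard `IP_TOS` socket option; some operating systems ignore or restrict this option, so treat it as an assumption to verify on the target platform.

```python
import socket

# Standard DSCP code points for the classes named in this section.
DSCP = {"AF41": 34, "AF31": 26, "CS5": 40, "CS0": 0}


def tos_byte(dscp: int) -> int:
    """Convert a 6-bit DSCP value to the 8-bit ToS/Traffic Class byte."""
    if not 0 <= dscp <= 63:
        raise ValueError("DSCP must be in the range 0..63")
    return dscp << 2  # the low two bits are reserved for ECN


def mark_socket(sock: socket.socket, dscp_name: str) -> int:
    """Request DSCP marking on an IPv4 socket; returns the ToS byte that was set."""
    tos = tos_byte(DSCP[dscp_name])
    sock.setsockopt(socket.IPPROTO_IP, socket.IP_TOS, tos)
    return tos


for name, value in DSCP.items():
    print(f"{name}: DSCP {value}, ToS byte 0x{tos_byte(value):02X}")
```

Marking at the endpoint only helps if switches and routers along the path are configured to trust and honor the tag.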
Latency Targets
- One-way latency < 150 ms: Acceptable for conferencing (ITU-T G.114)
- One-way latency 150–400 ms: Noticeable delay; conversations become awkward
- One-way latency > 400 ms: Unacceptable; participants talk over each other
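For monitoring, the thresholds above can be turned into a simple classifier. This is a minimal sketch; the function name and the returned labels are hypothetical, and the handling of the exact 150 ms and 400 ms boundaries follows the ranges as listed.

```python
def classify_one_way_latency(latency_ms: float) -> str:
    """Map a one-way latency measurement (ms) to the categories in this section."""
    if latency_ms < 0:
        raise ValueError("latency cannot be negative")
    if latency_ms < 150:
        return "acceptable"
    if latency_ms <= 400:
        return "noticeable"
    return "unacceptable"


for sample_ms in (80, 220, 450):
    print(sample_ms, classify_one_way_latency(sample_ms))
```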
VLAN Configuration for Conferencing Endpoints
Isolate conferencing traffic from general data traffic using VLANs:
- Create a dedicated VLAN for conferencing endpoints (cameras, displays, codecs)
- Configure QoS policies for that VLAN (AF41 for video, AF31 for audio)
- Restrict access to conferencing endpoints to authorized users/devices
- Monitor bandwidth usage and latency on the conferencing VLAN
- See vlan-configuration-for-av for detailed configuration guidance
BYOD (Bring Your Own Device) Connectivity
Enable local participants to join from laptops or mobile devices. Provide:
- Guest WiFi network with sufficient bandwidth and low latency
- Seamless provisioning (QR code, NFC, or app-based joining)
- Screen sharing from participant devices to the conference room display
- Audio routing so remote participants hear from room speakers (not laptop speakers)
Room Size Categories and Design Approaches
Different room sizes demand different design strategies.
Huddle Space (1–4 people)
Typical uses: Quick sync meetings, ad-hoc video calls, two-person interviews.
Design approach:
- Single 40–50 inch display mounted at eye level
- Single boundary condenser on table or ceiling-mounted mini array
- Single front-of-room speaker (integrated into display or separate)
- Simple endpoint (codec) with integrated mic/camera
- Minimum reverberation (soft furnishings, area rug, acoustic panels)
- Network: 4–8 Mbps sustained, 12 Mbps peak
Key consideration: Intimate space where every participant is close to the camera. Ensure camera FOV includes all participants without distortion.
Small Conference Room (4–8 people)
Typical uses: Team meetings, department syncs, client check-ins.
Design approach:
- Single 55–65 inch display, mounted at eye level or on a swivel arm for flexibility
- Two to three boundary condensers on table, or ceiling-mounted cardioid array
- Two front-of-room speakers (stereo pair for ambient sound)
- Professional endpoint with external camera (discrete, eye-level mounted) and array mic
- Acoustic treatment: wall panels and ceiling absorption to control RT60 to 0.4–0.5s
- Network: 8–15 Mbps sustained, 25 Mbps peak
Key consideration: Multiple participants around a table. Use boundary mics to capture all voices without anyone needing to raise their hand to be heard.
Medium Conference Room (8–14 people)
Typical uses: All-hands meetings, client presentations, training sessions.
Design approach:
- 65–85 inch display (single or dual for content + participants)
- Ceiling-mounted beamforming mic array or combination of wireless lavalier + table boundary mic
- Three to four front-of-room speakers or distributed ceiling speakers
- Professional codec with Ethernet connection (critical for stable video/audio)
- Acoustic treatment: 40–60% surface absorption, target RT60 0.35–0.45s
- Network: 15–30 Mbps sustained, 50 Mbps peak
Key consideration: Larger group means participants sit farther from mics. Beamforming arrays (e.g., Shure Microflex Advance) or wireless handheld/lavalier mics can help isolate individual talkers' voices.
Large Conference Room (14–20 people)
Typical uses: Town halls, large client meetings, training sessions.
Design approach:
- 85–110 inch display (often dual: participants on one, content on the other)
- Ceiling-mounted beamforming array plus wireless handheld for presenter
- Four to six strategically placed speakers (front + distributed or line array)
- Professional codec with full-duplex audio processing and AEC
- Robust acoustic treatment: target RT60 < 0.4s, sound-absorbing wall panels, ceiling tiles, carpeting
- Network: 30–50 Mbps sustained, 80 Mbps peak
Key consideration: Participants at far ends of the room are difficult to hear and see clearly. Wireless microphone for the main presenter ensures they're always captured at close distance.
Boardroom/Training Room (20+ people)
Typical uses: Executive meetings, large training sessions, multi-site conferences.
Design approach:
- Dual 85–110 inch displays (content + participants) or a single large display + projection screen
- Multiple wireless microphones (2–3 handheld + lapel mics for panelists)
- Ceiling-mounted array mic(s) for ambient capture
- Six to eight speakers: distributed ceiling or line array for even coverage
- Professional AV control system (Extron, Crestron) to manage complex switching, scaling, and routing
- Full acoustic treatment: wall absorption, diffusion, bass traps for low-end control
- Network: 50–100+ Mbps, dedicated bandwidth for conferencing
- Backup connectivity (dual WAN, cellular) for mission-critical meetings
Key consideration: Scale and complexity require careful integration planning, extensive testing, and professional installation. Professional AV control system is essential.
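The five categories above lend themselves to a lookup table for early-stage planning. The sketch below copies the display ranges and peak network figures from this section. Two choices are assumptions made for the example: headcounts that sit on a shared boundary (4, 8, 14, 20) are assigned to the smaller category, and the boardroom peak uses the 100 Mbps upper figure. The type and function names are hypothetical.

```python
from dataclasses import dataclass
from typing import Tuple


@dataclass(frozen=True)
class RoomProfile:
    name: str
    max_people: float  # upper headcount for this category (inf = open-ended)
    display_in: Tuple[int, int]  # display diagonal range, inches
    peak_mbps: int  # peak network figure from this section


PROFILES = [
    RoomProfile("Huddle Space", 4, (40, 50), 12),
    RoomProfile("Small Conference Room", 8, (55, 65), 25),
    RoomProfile("Medium Conference Room", 14, (65, 85), 50),
    RoomProfile("Large Conference Room", 20, (85, 110), 80),
    RoomProfile("Boardroom/Training Room", float("inf"), (85, 110), 100),
]


def profile_for(people: int) -> RoomProfile:
    """Return the first category whose headcount limit fits the group."""
    if people < 1:
        raise ValueError("people must be at least 1")
    for profile in PROFILES:
        if people <= profile.max_people:
            return profile
    raise AssertionError("unreachable: the last profile has no upper limit")


print(profile_for(6).name, profile_for(6).display_in)
```

A table like this is a starting point for scoping only; the acoustic, microphone, and camera details for each category still need room-specific design.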
Common Pitfalls
Windows Behind Participants Causing Silhouette
If participants sit between the camera and bright windows, the camera meters exposure off the bright background, leaving participants' faces dark and featureless. Remote attendees see only silhouettes. Solution: position seating so windows are to the side or in front of participants; use window coverings (blackout shades, diffusion fabric) to balance exposure; or install key lights (soft diffused lamps) at participant face level to counter-illuminate.
Camera Too High Making Participants Look Up
Mounting the camera above eye level (e.g., high on the wall or ceiling) forces participants to look up at the camera. Remote attendees see foreheads and the tops of heads, which is unflattering and unprofessional. Ideal camera height: 48–60 inches from floor (seated eye level). Mount cameras at this height even if the display is mounted higher.
Single Display Forcing Choice Between Content and People
A single 55-inch display in a 16-foot room can show either remote participants (gallery view) or shared content (screen share), but not both clearly. Participants miss facial cues while viewing content; remote attendees don't see local participants' reactions to shared material. Solution: use a dual-display setup with participants on one display and content on the other, or use a larger display (75–85 inches) in picture-by-picture mode (participants on one side, content on the other).
Insufficient Acoustic Treatment Causing Echo
Hard surfaces (glass, drywall, concrete) reflect sound, causing reverberation and echo. Remote participants hear their own voice echo back (disturbing) and local participants' voices distorted. Solution: apply acoustic treatment to at least 40% of room surfaces, such as wall absorption panels, acoustic ceiling tiles, an area rug, and soft furnishings (couches, chairs). Target RT60 of 0.3–0.5 seconds. Measure with RTA software.
Undersized Network Bandwidth
If network bandwidth is insufficient for video conferencing, the platform's adaptive bitrate codec reduces video quality (pixelation, freeze frames) or disables video (audio-only). Local participants perceive remote attendees as looking unprofessional, and engagement drops. Solution: measure actual bandwidth requirements, provision 150% headroom, implement QoS for conferencing traffic, and monitor bandwidth usage. See qos-for-audio and vlan-configuration-for-av for detailed guidance.
No BYOD Connectivity Planned
If local participants cannot easily join from laptops or mobile devices, ad-hoc participants are left out or forced to use the room's fixed codec (inconvenient). Solution: provide a guest WiFi network with sufficient bandwidth (provisioned separately from corporate network if necessary), enable screen sharing from participant devices, and ensure audio routing is seamless (remote participants heard on room speakers, not participant laptops).