As the COVID-19 pandemic continues, workplaces are scrambling to find a usable video conferencing option. The World Health Organization (WHO) is explicitly asking employers to "Consider whether a face-to-face meeting or event is needed. Could it be replaced by a teleconference or online event?"
Switching all meetings to video conferencing, however, poses a number of security challenges. This post will outline some questions to ask when selecting a conferencing platform, present a sample workflow for selecting a platform which satisfies threat-specific needs, and finally provide a brief overview of some existing conferencing platforms.
Developing a Group Conference Threat Model
The first question to ask when deciding on a platform to host a given meeting is whether audio-video capability is needed at all, or if the meeting can take place solely over text. If audio and video aren’t needed, Signal offers end-to-end encryption and supports both one-on-one and group chats, in which files can also be shared. One downside of Signal is that it requires all participants to share their phone numbers with each other. Another good option is Keybase, which also offers end-to-end encrypted individual and group chats and comes with file-sharing capabilities.
If you decide that audio and video are in fact necessary, the next question to ask is what level of security is needed for the video conference. Today, there are three basic tiers of video conferencing security to choose from. The appropriate tier depends on the situation-specific threat model for a particular meeting.
Tier 1: Third party-hosted conferencing with transport-level encryption
The weakest form of security offered by today's various video conferencing platforms is transport-level encryption with third-party conference hosting. Transport-level encryption means that while the conference is encrypted in transit--an adversary monitoring a WiFi network shouldn’t be able to see or hear the content of the meeting--the service provider itself will be able to see the content of the meeting if they want to, or are compelled to by law enforcement.
The question to ask is whether the service's ability to spy on your meetings is an actual concern for the specific meeting being threat-modelled. Are the topics you’ll be discussing in this meeting so sensitive that it would be bad for the server operators (or anyone they may in turn grant access to, such as government authorities) to observe them, or does it not really matter?
If you’re attending an online yoga class, then a Zoom, Hangouts Meet, or Jitsi Meet session may be fine for the given threat model. But if you’re going to be talking about anything sensitive where Zoom or Google could potentially receive and comply with a wiretap request to record the content of the meeting, then a higher security tier is desirable.
Tier 2: Self-hosted conferencing with transport-level encryption
If the video conferencing service spying on your meetings is in your threat model, the next security tier calls for the use of a conferencing platform which allows for self-hosting, such as Jitsi Meet. Some services, like Zoom and Vidyo, also allow customers to host media servers on-premises.
These options still provide only transport-level security, but if you self-host them, a third-party service is no longer able to spy on your meetings. If the server is hosted using a cloud hosting provider like Amazon AWS, however, it is still possible that the hosting provider itself may receive a wiretapping request.
Self-hosting services also requires considerably more internal resources than using third-party hosted options. You’ll need the in-house expertise to configure, secure, maintain, and update your video conferencing server.
Tier 3: End-to-end encrypted conferencing
For meetings requiring the highest level of security, where it is essential that the risk of compromise is kept to a minimum, you should use a platform that supports end-to-end encryption (E2EE). Unlike transport-level security, which encrypts messages only in transit but allows the server to see them, E2EE means that the meeting content may only be viewed by the meeting participants (the 'ends'), not any intermediaries. In other words, in E2EE-based video conferencing options, the server (if it exists at all) is not able to spy on the meeting (however, it may be able to still see the metadata about the conference, such as names, emails, and IP addresses of the participants). Many of the conferencing platforms also provide phone numbers for participants to dial into -- these phone connections are generally not encrypted at all.
If the answer to the earlier question (is the topic of the meeting sensitive enough that we would not want a third-party service or state actors to overhear it?) is a resounding 'yes', then a conferencing platform which offers E2EE should be used. Unfortunately, there are currently not many scalable E2EE video conferencing options available.
Cisco's Webex offers support for up to 200 simultaneous users via their Business plan, which requires a minimum of five monthly licenses. Aside from the E2EE capabilities, they also offer the ability to host the conference servers on-premises, which makes Webex the most viable enterprise-level video conference platform with E2EE support and self-hosting.
Apple's FaceTime supports up to 32 participants, but all participants need to have Apple devices.
Google's Duo works on both iOS and Android devices, but is limited to a maximum of 12 participants. While both claim to be E2EE, FaceTime and Duo are closed source applications, so these claims cannot be checked against the source code (Zoom, too, claimed to be E2EE, though that was shown to be untrue).
While both FaceTime and Duo are freely available, they do not offer features such as meeting links or scheduling, instead working instantaneously with the host having to coordinate the conference call in real time.
Meanwhile, Jami (formerly known as GNU Ring, which was in turn formerly known as SFLphone) works on the widest array of platforms, advertises itself as being E2EE, is open source, and limits the number of conference participants only by the host's available bandwidth. In our tests, however, Jami proved to be a touch persnickety, routinely crashing or malfunctioning when run on Mac or Ubuntu machines, though it was somewhat more responsive when run on mobile iOS and Android platforms.
Questions for appropriate tier selection
When deciding on which conference platform is appropriate for a specific meeting, sketch out a basic threat model for the given meeting by posing a number of questions:
- Does the meeting require audio/video or can it take place over text?
- Is the sensitivity of the meeting such that it requires end-to-end encryption or is transport-level encryption sufficient?
- Do you have the capacity to host your own server, or will you be reliant on a third-party host?
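The questions above can be sketched as a small decision function. This is only an illustration of the tier logic described in this post, not a substitute for a real threat model. The names `Option` and `select_option` are made up for this example, and the fallback to Tier 3 when the provider is a threat but self-hosting is not possible is an assumption drawn from the tier descriptions.

```python
from enum import Enum


class Option(Enum):
    TEXT_CHAT = "E2EE text chat (for example Signal or Keybase)"
    TIER_1 = "Tier 1: third party-hosted, transport-level encryption"
    TIER_2 = "Tier 2: self-hosted, transport-level encryption"
    TIER_3 = "Tier 3: end-to-end encrypted conferencing"


def select_option(needs_av: bool,
                  provider_is_a_threat: bool,
                  needs_e2ee: bool,
                  can_self_host: bool) -> Option:
    """Map the threat-model questions to one of the options in this post."""
    if not needs_av:
        # The meeting can take place over text alone.
        return Option.TEXT_CHAT
    if needs_e2ee:
        # Highest sensitivity: no intermediary should see the content.
        return Option.TIER_3
    if not provider_is_a_threat:
        # The hosting service seeing the meeting is not a concern.
        return Option.TIER_1
    if can_self_host:
        # Keep transport-level encryption but remove the third-party host.
        return Option.TIER_2
    # Assumption: with no capacity to self-host, fall back to E2EE.
    return Option.TIER_3


# An online yoga class: audio/video needed, nothing sensitive discussed.
print(select_option(needs_av=True, provider_is_a_threat=False,
                    needs_e2ee=False, can_self_host=False).value)
```

Keeping the questions as separate boolean inputs makes it easy to see which single answer moves a meeting from one tier to the next.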
A selection guide to secure group conferencing options
Here is a handy flowchart you can use when trying to determine which group conferencing solution to use.
Index of Referenced Group Conferencing Platforms
While information about the following platforms was accurate at the time of publication, it may quickly become outdated. If you’re looking at this in the future, you may have to do additional research.
Google's Duo provides end-to-end encryption (E2EE) and runs on both iOS and Android devices. However, the group conferencing feature does not work on desktops, and the total number of participants per group session is limited to 12. Google servers may also see certain "info about the call", though Google does not specify what this particular information is.
Apple's FaceTime provides E2EE; however, it is only available for Apple devices and the total number of participants per group session is limited to 32. Apple keeps a record of some metadata "such as who was invited to a call, and your device’s network configurations, and store this information for up to 30 days". Apple also publishes a transparency report (Apple, Google, and Signal are the only organizations mentioned here that do so).
Google's Meet provides only transport-level encryption. In their Hangouts Help section titled 'How Hangouts secures information', Google does not actually explain how Hangouts secures information at all, instead only restating the section heading to say that "When you message or talk with someone on Hangouts, your information is secured". Hangouts and Meet do offer support across various platforms (including Windows, Mac, and Linux, as well as mobile devices). Google Hangouts supports video conferencing for up to 10 people for free, while Google Meet supports up to 250 people but requires a paid G Suite subscription (the basic G Suite subscription typically supports only up to 100 users, but currently supports up to 250 people through the end of September 2020).
In a now-deleted question in their Requests for User Information FAQs (formerly titled 'Legal process for user data requests FAQs'), Google had previously explained that a "wiretap order requires a company to hand over information that includes the content of communications in real-time". Though the number of wiretap requests Google has disclosed it has received is relatively low (for example, there were three requests from US authorities between July and December 2018), it did comply with all the wiretap requests, signifying that it is possible for Google to facilitate real-time communications monitoring (though it is unclear which specific Google services were subject to the wiretap requests). Google, Apple, and Signal are the only organizations mentioned here which provide a transparency report.
Jami (formerly known as GNU Ring, and as SFLphone before that) is an open source platform which provides E2EE (though there are questions about its actual inner workings) and operates via a distributed peer-to-peer model which doesn't require a centralized server. With regard to video conferencing, Jami have noted that "the maximum number of participants depends on the hosting device’s computing power and available bandwidth. We have tested with up to sixteen members but it could potentially go higher". In our tests, which involved running Jami on macOS, Ubuntu, Arch, Android, and iOS, Jami would crash sporadically on the Mac client, and was not able to initiate one-on-one, let alone group, video calls between Mac and Ubuntu machines. Jami functioned somewhat better on mobile OSes, but a video conference with only four users quickly became unstable and eventually froze.
Jitsi Meet offers the ability to either use Jitsi's own server or to host your own server to facilitate group video conferencing. Various managed service providers, such as Greenhost, may also assist in setting up a third-party Jitsi server. Jitsi's own server (meet.jit.si), unlike the Jitsi Meet code itself, is "not opensourced and is proprietary". Jitsi claims to provide "secure video conferencing solutions", though it goes on to explain that it does not currently provide E2EE. Jitsi is currently working on enabling E2EE for conferences, but the feature is not yet fully ready. The maximum number of concurrent conference participants is 75, though Jitsi notes that with "more than 35, the experience will suffer". Jitsi also claims that it has received a security audit; however, the audit is "company internal" and does not appear to be publicly viewable.
Keybase provides E2EE for either one-on-one or group text chats, but like Signal it does not offer support for audio-video conferencing. Unlike Signal, Keybase does not require a phone number to work. Keybase and Signal are the only tools mentioned here which have received a public security audit.
Signal provides E2EE using the Signal Protocol. However, while it supports group text conferences, it only allows for one-on-one voice and video calls. Signal requires all participating users to have a phone number that is visible to all members of a group chat. Signal has received an independent security audit (Signal and Keybase are the only tools mentioned here which have received a public security audit), and also provides a transparency report for government requests for information (Signal, Apple, and Google are the only organizations mentioned here which provide a transparency report).
Vidyo provides paid subscription audio-video group conferencing with transport-level encryption, while also advertising "encrypted endpoint management, signaling, and media for end-to-end security for your entire VidyoConferencing system", though it is unclear if this refers to E2EE for users or only 'E2EE' between various Vidyo components, with the option only existing to self-host said components.
Cisco's Webex provides support for a maximum of 200 users on their Business plan (other plans are less expensive, but allow for fewer users). Webex allows for on-premises hosting as well as the option to enable E2EE so that the Webex server does not see encryption keys, all of which appears to make Webex the most viable enterprise-level solution for those who need E2EE video conferencing.
Zoom provides paid subscription audio-video group conferencing with transport-level encryption (despite previous misleading marketing which falsely claimed it provided E2EE); however, it has recently been plagued by various security and privacy concerns. Zoom does offer the option to self-host, though conference metadata still travels through Zoom's own servers. Zoom does not currently publish a transparency report, but has said that it will be preparing one over the next few months, and that it is working on functionality to let users generate encryption keys themselves instead of relying on Zoom's servers for key generation.
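As a rough sketch, the index above can be turned into a small table and filtered by meeting requirements. The field names and the `shortlist` helper are hypothetical. The values are transcribed from this index only: E2EE is recorded as described here rather than independently verified, and `None` means the index gives no fixed participant limit. Like the index itself, the data may already be outdated.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass(frozen=True)
class Platform:
    name: str
    e2ee: bool                       # E2EE as described in the index (not verified)
    max_participants: Optional[int]  # None: no fixed limit given in the index
    self_hostable: bool              # index mentions self-hosting or on-premises hosting


PLATFORMS = [
    Platform("Duo", True, 12, False),
    Platform("FaceTime", True, 32, False),
    Platform("Google Hangouts", False, 10, False),
    Platform("Google Meet", False, 250, False),  # 250 only through September 2020
    Platform("Jami", True, None, False),         # peer-to-peer, no central server
    Platform("Jitsi Meet", False, 75, True),
    Platform("Vidyo", False, None, True),
    Platform("Webex", True, 200, True),
    Platform("Zoom", False, None, True),
]


def shortlist(participants: int,
              need_e2ee: bool = False,
              need_self_hosting: bool = False) -> List[str]:
    """Return names of platforms that are not ruled out by the requirements."""
    result = []
    for p in PLATFORMS:
        if need_e2ee and not p.e2ee:
            continue
        if need_self_hosting and not p.self_hostable:
            continue
        # Platforms with an unknown limit stay on the list for manual checking.
        if p.max_participants is not None and p.max_participants < participants:
            continue
        result.append(p.name)
    return result


print(shortlist(100, need_e2ee=True, need_self_hosting=True))  # ['Webex']
```

A shortlist like this is only a starting point; each remaining candidate still needs the due diligence described in the next section.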
Questions for conducting due diligence in platform selection
Aside from the platforms mentioned here, there are quite a few other existing group conferencing platforms, not to mention new options popping up all the time. Before selecting a particular platform, perform due diligence by checking for such factors as:
- Does the platform provide end-to-end encryption (E2EE) or just transport-level encryption?
- Does the platform allow for self-hosting?
- If it’s a third-party service, which jurisdiction are the servers located in?
- How do the jurisdictions the servers are located in handle data requests from government authorities?
- What is the maximum number of concurrent participants the platform supports per group meeting?
- Has the platform undergone an independent security audit?
- When was the most recent such audit performed?
- Was the audit done by an independent firm, or by someone with ties to the platform?
- Are there any existing known but unpatched security vulnerabilities in the platform?
- How prompt and responsive has the platform been in addressing previous vulnerabilities or security concerns?
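One way to keep track of these checks while researching a candidate platform is a simple record with one field per question. This is a minimal sketch, and the class, field, and function names are invented for illustration. A value of `None` means the question has not been answered yet.

```python
from dataclasses import dataclass, fields
from typing import List, Optional


@dataclass
class DueDiligence:
    """Answers for one candidate platform; None means 'not yet researched'."""
    provides_e2ee: Optional[bool] = None
    allows_self_hosting: Optional[bool] = None
    server_jurisdiction: Optional[str] = None
    jurisdiction_data_request_handling: Optional[str] = None
    max_concurrent_participants: Optional[int] = None
    independently_audited: Optional[bool] = None
    last_audit_date: Optional[str] = None
    auditor_is_independent: Optional[bool] = None
    has_known_unpatched_vulnerabilities: Optional[bool] = None
    vulnerability_response_notes: Optional[str] = None


def open_questions(record: DueDiligence) -> List[str]:
    """List the checklist items that still have no answer."""
    return [f.name for f in fields(record) if getattr(record, f.name) is None]


# A hypothetical platform for which only two questions have been answered so far.
candidate = DueDiligence(provides_e2ee=False, allows_self_hosting=True)
print(len(open_questions(candidate)))  # 8
```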