Build WebRTC desktop apps with OpenTok and Electron

October 4, 2017, 5:44 am

≫ Next: How to add a video chat embed to your Wix website

≪ Previous: Introducing our brand new Developer Evangelist!

TokBox is pleased to announce that with the release of OpenTok.js v2.12, OpenTok now officially supports Electron, allowing developers to build hybrid desktop applications which combine native APIs and web development to create native-like applications.

The Electron Framework

Electron is a popular open-source framework built on top of Chromium and Node.js and is backed by Github. It allows you to create cross-platform applications using JavaScript, HTML, and CSS. Companies like Slack have developed their Mac desktop applications using Electron and also contribute heavily to the Electron open-source ecosystem. As part of our ongoing program to extend OpenTok support to as wide a range of frameworks as possible, we added support for Electron to enable developers to implement real-time communication in their Mac, Linux, or Windows applications.

Who is Electron For?

If you’re a web developer who wants to build a desktop application then Electron is for you because you can get started within minutes without having to learn any other language. However, if you’re a Windows developer then it may be more natural to use the OpenTok Windows SDK and work in C#.

OpenTok and Electron

Building an OpenTok Electron application is almost the same as building a JS web application. However, one of the key points in using OpenTok with Electron is keeping Chromium up to date to ensure that our OpenTok.js SDK fully supports video chatting and features like screensharing. One of the advantages of using OpenTok with Electron is that you don’t need to create a screensharing extension to allow your end-users to share their screen.

To start building the OpenTok Electron sample application, clone the OpenTok Web Samples Github repository.

git clone git@github.com:opentok/opentok-web-samples.git

You’ll notice that the OpenTok sample code for adding a video chat is almost identical to the web sample app code, but with some slight changes such as adding Electron to the application.

Now, install both the OpenTok and Electron libraries by running `npm install` in your terminal.

To connect to a session, publish a stream, and subscribe to a stream, grab your TokBox API Key, Token, and Session from your TokBox account. If you don’t already have an account, you can sign up here.

// Set Credentials
const apiKey = "";
const sessionId = "";
const token = "";

To see the app in action, start the application by running `npm start`

Congratulations! You’ve now created a desktop application with real-time communication. Below, you can see me, top picture, discussing Electron support with one of our server engineers, Madhav, using the OpenTok Electron sample application:

Electron for OpenTok sample app

What’s Next?

At TokBox, we continue to innovate to enable developers to use OpenTok regardless of the platform or framework they’re developing on. As a result, in addition to adding support for hybrid frameworks such as Electron, we are developing a client SDK for macOS. For access to an early version, contact us at sdk-beta@tokbox.com.

If you’re interested in implementing your own messaging and social apps like Slack and Snapchat, check out the following posts:

The post Build WebRTC desktop apps with OpenTok and Electron appeared first on TokBox Blog.

↧

How to add a video chat embed to your Wix website

October 12, 2017, 11:11 am

≫ Next: Build a live video app with ARKit and OpenTok

≪ Previous: Build WebRTC desktop apps with OpenTok and Electron

Recently, TokBox introduced Video Chat Embeds enabling you to add real-time communication to your website with just a few lines of code. Video embeds can be easily integrated into websites built on any of the following platforms:

Wix
Squarespace
Zoho Sites
Weebly (Business and above)
WordPress hosted account (paid)

Who are Video Chat Embeds for?

If you’re an educator, doctor, support agent, blogger or someone who wants to add a 1:1 video chatting to your website then embeds are for you!

Using Video Chat Embeds

Let’s go over adding a video chat on our Wix website together where people can give 1:1 live video music lessons. Although we’re doing this for our music website, this can be applied to doctor-patient, student-teacher, interviewer-interviewee or any other 1-to-1 live video application for the web.

You can access the video embed for our music Wix site here: https://manik669.wixsite.com/musiclessons/live-lessons

Before we begin, please make sure to have your OpenTok Video Chat Embeds code handy. If you don’t have an account already, sign up here. For more information on how to create an embed project, click here. Be sure to whitelist your domain so your video chat embeds can be verified. You can either choose the iframe or script embeds, but please note that iframe embeds are not supported on Safari because Safari doesn’t allow iframes to access a user’s camera or microphone.

Adding Code to Wix

Log into Wix.
Click “Manage Site”.
Click “Edit Site”.
Select the page where you want to add the video chat
On the widget on your left, click on the “+(Add)” icon.
At the bottom, click “More”.
Under “HTML & Flash”, click “HTML Code”.
- This will create a small widget on your site editor.
- Place this widget where you’d like your video chat to be on your page
Click “Edit Code” button on the widget and add select the “Code” option.
Add your OpenTok Video Chat Embed code in the text box below and click “Update”.
After adding your code, click “Publish” on the top right corner to deploy your changes live.

An example of the video chat embed and the HTML Settings in edit mode will look like this:

Add video chat to your Wix website

You may notice that that everyone is in the same room, however, if you change the roomID in the embed code, you can dynamically generate rooms to create private channels. At the end of the embed URL, you’ll see a room parameter which is set to DEFAULT. If you want to create a new room, simply change the room parameter to another value. For example, you can create an embed with room=lesson1 and another one with room=lesson2. This will ensure that any participants in the “lesson1” room will only be able to communicate with other participants in “lesson1”. For more information on dynamic rooms for embeds, please visit: https://tokbox.com/developer/embeds/#rooms

Congratulations! You’ve now added a video chat to your WIX website! Take a look at the example below where I’m getting a guitar lesson from, Liz, a member of our design team who also happens to enjoy playing the guitar:

Live music lesson with video chat embeds

What’s Next?

As you can see, adding a video chat to a website has never been easier. At TokBox, we’re extremely passionate about enabling our users so they can use OpenTok regardless of the platform or framework they’re developing on. If you’re interested in creating your own appointment application that uses embeds to power a video chat, please take a look at our video chat embed demo app as it demonstrates the use of dynamic embed rooms for appointments between two people. Alternatively, if you’d like to have full control over the functionality and UI, screensharing, archiving, session monitoring, multi-platform support, or have a large multi-party video chatting application then you should use the OpenTok API.

The post How to add a video chat embed to your Wix website appeared first on TokBox Blog.

↧

Build a live video app with ARKit and OpenTok

October 17, 2017, 2:41 pm

≫ Next: How to use Android Picture-in-Picture mode with OpenTok

≪ Previous: How to add a video chat embed to your Wix website

If you have an iPhone, chances are that you have already upgraded to the latest version of iOS. In its 11th version, Apple has introduced many new things. As usual, some of them are related to the new hardware, others improve and polish the well-known iOS formula. But there is one thing that is completely new and will bring a new type of applications that never existed before at this scale. We are talking about Augmented Reality (or AR) applications, and the Apple SDK ARKit.

An intro to AR

AR applications simulate virtual elements existing in our world. That’s achieved in the phone by overlaying those virtual elements over a view which is displaying the video stream of our camera. At the same time, it uses the phone’s sensors like the accelerometer or core motion data to track our movements, so when we move around the virtual object, it is rotated like a real object living in the same place would do.

These kind of applications have existed since long before Apple introduced its own SDK. Many of them are based on the well-known OpenCV library, whose great capabilities to do live image processing make AR apps possible. However, ARKit will be a strong push for this kind of apps, since the right hardware in the form of dual cameras is now available and so the user can achieve greater accuracy of AR objects in the real world. It is also well integrated with the iOS developer tools and other SDKs like SpriteKit and SceneKit.

OpenTok and ARKit

If you’ve seen the recent post from our CTO Badri Rajasekar about his thoughts and predictions around AI, you’ll know that here at TokBox, we think there is a whole host of opportunities to be had by combining the power of AR with cameras and microphones. So when Apple unveiled the ARKit back in June at the WWDC we immediately thought how cool it could be to put the video of the participants of a conference in a virtual room combining it with virtual elements. Using that inspiration as a starting point, we started investigating the new SDK and how we could integrate it with the OpenTok SDK.

In this blog post we will describe how you can use ARKit to show the video of your session participants.

All the code of this blog comes from a ARKit sample that you can find here.

ARKit primer

ARKit core is its ARSession class. Whenever you start a session, it will start capturing images from your camera and reading data from your phone sensors in order to perform the calculations to show your virtual elements as if they were real.

In many cases, you will want to place virtual objects over the ground or “hanging” on the walls. For that purpose, ARSession class also has capabilities to inform you when a flat surface has been detected.

Before jumping to the image rendering part, we need to know that ARSession class also provides a way to add elements in a given position via the class ARAnchor. For example, if you want to place a virtual Sphere in the AR world, you will use an ARAnchor to position it. As described in the last paragraph, when a flat surface is detected, ARSession will provide you an ARAnchor in case you want to place an object at that position.

Creating Virtual Elements

We have been talking about placing virtual elements, but we haven’t talked about how we are going to create those elements. ARSession class and the rest of AR* classes will help us with world tracking, but we need a way to draw the virtual objects.

For this purpose ARKit design is open so it can be used in conjunction with many 3D and 2D rendering engines, such as Unity or Unreal Engine. However, one of the easiest ways to combine ARKit is to use Apple’s own frameworks to render content: SpriteKit or SceneKit depending if you want to show 2D or 3D elements.

If you prefer to follow the Apple way, like we did for our sample, ARKit SDK offers you two classes, ARSCNView and ARSKView. In both, you will have a typical SCNView or SKView where you can render sprites or meshes, and you will have an ARSession ready to move and place them as if they were real. In our sample, we decided to use SCNKit, to show the video of the people of a conference as if they were in a real frame, so we will continue exploring SCNKit and ARSCNView.

ARSession and ARSessionConfiguration

Source: https://developer.apple.com/documentation/arkit/building_your_first_ar_experience

OpenTok and SceneKit

SceneKit is a high level 3D API that apple introduced a couple of years ago whose main aim is to create Games or 3D visualization Apps. You could say that SceneKit is the Apple response to Unity or other high level Game Engines out there.

So, our plan was to create a scene in SCNKit (Xcode has a nice 3D scene editor built-in), and render the video content of a OpenTok session over a 3D plane (the purple one in the image below), which was placed inside a bigger cube which will act as its frame.

OpenTok and SceneKit for ARKit app

SceneKit uses both OpenGL or Metal backends to do the 3D object rendering, however, not everything works the same in both backends. For example, Metal backend only works on real devices and not in the simulator and some other features are only available if you are using Metal. This makes sense knowing that Apple is moving forward to Metal due to its better API and better performance.

Using Metal was one of the first problems when we tried to use OpenTok SDK. Our SDK uses OpenGL to render the video of a OpenTok Session in a UIView element. Since we were having tough times trying to use our default OpenGL renderer in SceneKit scenes we decided to create a Metal video renderer that we could use in this sample and in anywhere else Metal is preferred.

Metal Rendering in OpenTok

In some ways, Metal design is close to OpenGL and the concepts we used to build our OpenGL renderer are valid for building the Metal one. We just need to reformulate how we render the video.

In OpenGL, we use 2D textures as the input of a fragment shader. That fragment shader is a small program that runs on the GPU (that means high parallelization, or in other words, hundreds of mathematical operations done in parallel) and converts from the input format of the video, YUV, to another 2D texture in RGB. Then we assign that texture to a 2D UIView, and the video is renderer on the screen.

In Metal, we will use something similar. In this case we use a compute shader that will also perform hunderds of matrix multiplications in the GPU. That compute shader will take 3 2D textures for Y, U and V planes, and will write onto another 2D texture that happens to be a MTLTexture.

Guess why? SCNKit objects use SCNMaterial instances to give real apparency to SCNKit objects, and yes, SCNMaterial are formed by MTLTexture.

So we have our path clear, we will need to:

Create a custom OpenTok renderer that will receive video frames,
Those video frames will be fed to the metal compute shader
The shader will convert them to a RGB in a MTLTexture
That texture will be assigned to a SCNPlane in our scene.

Easy, right?

Metal rendering in OpenTok for ARKit app

YUV to RGB Metal compute shader

Once we have a clear view of what we want to achieve, let’s see the code we used to make it real. We will start by showing the metal shader:

kernel void YUVColorConversion(
       texture2d<uint, access::read> yTexture [[texture(0)]],
       texture2d<uint, access::read> uTexture [[texture(1)]],
       texture2d<uint, access::read> vTexture [[texture(2)]],
       texture2d<float, access::write> outTexture [[texture(3)]],
       uint2 gid [[thread_position_in_grid]])
{
    float3 colorOffset = float3(-(16.0/255.0), -0.5, -0.5);
    float3x3 colorMatrix = float3x3(
                                float3(1.164,  1.164, 1.164),
                                float3(0.000, -0.392, 2.017),
                                float3(1.596, -0.813, 0.000)
                            );

    uint2 uvCoords = uint2(gid.x / 2, gid.y / 2); // Due to UV subsampling
    float y = yTexture.read(gid).r / 255.0;
    float u = uTexture.read(uvCoords).r / 255.0;
    float v = vTexture.read(uvCoords).r / 255.0;
    float3 yuv = float3(y, u, v);
    float3 rgb = colorMatrix * (yuv + colorOffset);
    outTexture.write(float4(float3(rgb), 1.0), gid);
}

As you can see, the code of the shader is quite simple, it just reads the pixel data from the three input textures, and performs some operations to convert it from YUV colorspace to RGB. The idea behind this is very similar to what we do in our OpenGL renderer

Custom Metal renderer

In order to create a custom OpenTok Renderer, we need to create a class that conforms to the OTVideoRender protocol. That protocol has just one function func renderVideoFrame(_ frame: OTVideoFrame). As you can image, that function will be called around 10 to 30 times per second (depending on the video fps received) and every time it will pass a YUV video frame in its frame parameter.

We need to extract the video frame information from that frame, and send it to our shader. In our Swift class we will have 4 instances of MTLTexture, 3 for YUV inputs and 1 for RGB output. Starting with the RGB output, we create it using:

textureDesc = MTLTextureDescriptor.texture2DDescriptor(
                 pixelFormat: .rgba16Float,
                 width: Int(format.imageWidth),
                 height: Int(format.imageHeight),
                 mipmapped: false) 
textureDesc?.usage = [.shaderWrite, .shaderRead] 
 // device is MTLDevice instance
let outTexture = device.makeTexture(descriptor: textureDesc!)

The difference is the pixelFormat, RGB output will be rgbafloat, and Y input will be r8Uint since the Y place just has 1 byte per pixel.

In order to fill the yTexture with the data coming from the OTVideoFrame, we will do:

guard let planes = frame.planes else { return }
yTexture!.replace(
                   region: MTLRegionMake2D(0, 0,
                                         Int(format.imageWidth), Int(format.imageHeight)),
                   mipmapLevel: 0,
                   withBytes: planes.pointer(at: 0)!,
                   bytesPerRow: (format.bytesPerRow.object(at: 0) as! Int))

We will do the same for U and V textures, but taking into account that our image format has 2:2 subsampling in the U and V planes, meaning that the size of those textures is divided by 2. (Same as you could see in the shader code).

Once we have the texture data in our custom OpenTok renderer, we need to pass the three textures to the shader and effectively, run the shader. To do so, we need to issue a Metal command buffer, encode the commands and add the encoded commands to the command buffer. If you want to know more about how this works, please read official Apple documentation about Command Organization and Execution Model.

Although it could sound a little bit intimidating, as you can see, the code is not that complex:

let defaultLibrary = device.makeDefaultLibrary()
let commandQueue = device.makeCommandQueue()
let commandBuffer = commandQueue?.makeCommandBuffer()
let commandEncoder = commandBuffer?.makeComputeCommandEncoder()
let kernelFunction = defaultLibrary?.makeFunction(name:"YUVColorConversion")
let pipelineState =
                         try! device.makeComputePipelineState(function: kernelFunction!)

commandEncoder?.setComputePipelineState(pipelineState) 
commandEncoder?.setTexture(yTexture, index: 0)
commandEncoder?.setTexture(uTexture, index: 1)
commandEncoder?.setTexture(vTexture, index: 2)
commandEncoder?.setTexture(outTexture, index: 3)

commandEncoder?.dispatchThreadgroups(
                        threadgroupsPerGrid,
                        threadsPerThreadgroup: threadsPerThreadgroup)
commandEncoder?.endEncoding() 

commandBuffer?.commit()

Once we call commit, the shader will do its job, and will start processing our textures.

Linking everything together

If you remember our checklist:

Create a custom OpenTok renderer that will receive video frames,
Those video frames will be fed to the metal compute shader
The shader will convert them to a RGB in a MTLTexture
That texture will be assigned to a SCNPlane in our scene.

There is just one thing left. Assign the output of the Metal shader to the SCNMaterial of our SCNPlace in the scene.

To do that, we need to get the reference of the plane, and we will do that in our UIViewController,

let scene = SCNScene(named: "art.scnassets/opentok.scn")!
let node = scene.rootNode.childNode(withName: "plane", recursively: false)!

Once we have the node, we will send that node to the Custom Renderer we have built, and it will assign the texture to its material, by executing:

node.geometry?.firstMaterial?.diffuse.contents = outTexture
// outTexture from our custom capturer.

After this long journey around SCNKit and Metal rendering, we haven’t forgotten that this blog post is about ARKit. We are using a ARSCNView, so it has the session bundled in: we just need to run the session (and pause it!)

override func viewWillAppear(_ animated: Bool) {
  super.viewWillAppear(animated)
  let configuration = ARWorldTrackingConfiguration()
  sceneView.session.run(configuration)
}

    
override func viewWillDisappear(_ animated: Bool) {
  super.viewWillDisappear(animated)
  sceneView.session.pause()
}

Also, we need to create an OpenTok session, connect it to start viewing the video of a given subscriber in that plane, that due to ARKit will be floating in the space in our room (or wherever we run the sample). We can walk away or walk around it, and the video will appear to be living in our world.

The code in the blog post are excerpts. If you want to see the whole sample in action, don’t miss its repo at github. Using this sample as the starting point, it is very easy to add more video participants to the virtual room, or to even model a complete virtual living room where the video of your OpenTok session participants will be hanging around.

The post Build a live video app with ARKit and OpenTok appeared first on TokBox Blog.

↧

How to use Android Picture-in-Picture mode with OpenTok

October 30, 2017, 6:14 am

≫ Next: Hacking Social Video: Building a Group Live Video App: Part 2

≪ Previous: Build a live video app with ARKit and OpenTok

Picture this: you are outside in a park and attending a meeting using your Android phone with a cool OpenTok-based application, but suddenly you need to check some information from a different app on your phone. Currently, your only option would be to put the original app in the background, and stop seeing the rest of the people in the meeting while you check that information.

Well we have good news! Starting from Android O, there is an alternative. In the latest version of Android, Google added the possibility to put an Activity in “Picture-in-Picture” (PiP) mode: that means that you will be able to continue seeing the video of a OpenTok session while you are doing other things with your phone. In fact, this was already available in Android M, but it was only available on Android TV. Now, with Android O, it is available in phones too.

In this blog post we will see how you can add this feature to your application using the OpenTok SDK.

Preparing your Application to support Android Picture-in-Picture

In order to have PiP, you can use the same SDK you are already using – no changes are required from OpenTok side.

When in PiP mode, your activity will be shown on top of everything else, and in a small window, so you should plan to remove all UI elements and display just the video of the OpenTok session.

You don’t need to put anything to exit from PiP and back to fullscreen mode. Android will deal with that, adding some controls that will appear when the user touches the PiP Window.

In order to support PiP, you need to add it explicitly to the activity you want to be displayed in PiP mode, as you would have guessed, the PiP mode is set per-activity. Usually in an OpenTok application, the activity you want to add PiP support is the one which is dealing with the session video stream.

You need to add three things in your activity entry in the manifest file:

  android:resizeableActivity="true"

  android:supportsPictureInPicture="true"

  android:configChanges=

        "screenSize|smallestScreenSize|screenLayout|orientation"

Besides the supportsPictureInPicture, it is important to add the resizableActivity attribute and the configChanges, so your activity won’t be recreated every time it enters or exits PiP mode.

Entering Picture-In-Picture mode

Once you have declared that your activity is ready to be displayed in PiP mode, you just need to have some UI element or Event that will trigger this mode in the Activity. Fortunately, this is quite simple: the Activity class has a method called enterPictureInPictureMode() that, without any surprise, will put the activity in Picture-in-Picture mode.

If you try to call that method, you will see that it is deprecated. This is because it was there in Android TV and now the recommended way is to build some parameters that will customize the PiP experience.

The parameters are controlled by building an instance of PictureInPictureParams class. That class will allow you to set, for example, the aspect ratio of the PiP window where your activity will be displayed. If your phone is in portrait mode, you will want to set an aspect ratio similar to the phone screen aspect ratio:

PictureInPictureParams params = new PictureInPictureParams.Builder()

       .setAspectRatio(new Rational(9,16))

       .build();

Using this class you can also tweak the controls that will be shown while in PiP. See the documentation for more details

To be in PiP or not to be

Android Picture-in-Picture with OpenTok live video call

The last thing that we need to do to have a cool video chat application that will allow you to see the other peer while browsing your Twitter timeline is to adapt the user interface for these different modes.

Like the rest of the things we’ve done, this is also pretty easy. You need to @Override a method that will be called when the mode is changed: that method is void onPictureInPictureModeChanged(boolean is InPictureInPictureMode, Configuration newConfig).

When the activity is entering PiP mode, that method will be invoked, and there is where you will do things like hiding the action bar, hiding the publisher view or hiding any button that is not useful in PiP. Take into account that the size of the PiP window is quite small, so you probably only want to have the subscriber or subscribers video on it.

For example this is the code we use in the sample code that illustrates this blogpost:

@Override
public void onPictureInPictureModeChanged(boolean isInPictureInPictureMode,
Configuration newConfig) {
   super.onPictureInPictureModeChanged(isInPictureInPictureMode, newConfig);
   if (isInPictureInPictureMode) {
       findViewById(R.id.button).setVisibility(View.GONE);
       mPublisherViewContainer.setVisibility(View.GONE);
       publisher.getView().setVisibility(View.GONE);
       getActionBar().hide();
   } else {
       findViewById(R.id.button).setVisibility(View.VISIBLE);
       mPublisherViewContainer.setVisibility(View.VISIBLE);
       publisher.getView().setVisibility(View.VISIBLE);
       if (publisher.getView() instanceof GLSurfaceView) {
           ((GLSurfaceView)publisher.getView()).setZOrderOnTop(true);
       }
       getActionBar().show();
   }
}

See that we hide the action bar (be careful about using AppCompatActivity and getSupportActionBar) and the publisher container.

Apart from the UI elements themselves, OpenTok SDK offers a method to pause a session when the activity is paused, however when PiP is in play, you probably don’t want to pause a session. When you call session.onPause, you unsubscribe from the video stream from your subscribers, but in PiP mode you still want to see the video of your peers.

Activity has a handy isInPictureInPictureMode method that will tell you if the activity is in that mode or not, so your new onPause/onResume method will now look something like this:

@Override

protected void onPause() {
   super.onPause();
   if (!isInPictureInPictureMode()) {
       if (session != null) {
           session.onPause();
       }
   }
}
@Override
protected void onResume() {
   super.onResume();
   if (!isInPictureInPictureMode()) {
       if (session != null) {
           session.onResume();
       }
   }
}

Time to get building

At this point, you should be able to add this cool feature to your app. Android handles most of the process and requires few changes from your side so this is a win/win situation.

We have built a simple sample code that illustrates the process of adding Picture-in-Picture to an OpenTok-based Android application.

You can start building for free on the OpenTok platform by signing up for an account here, and if you’re looking for more information on building with live video, why not check out our posts on building social video apps (Part 1 and Part 2)? You might also find our post about using OpenTok with Kotlin for Android useful.

The post How to use Android Picture-in-Picture mode with OpenTok appeared first on TokBox Blog.

↧

Hacking Social Video: Building a Group Live Video App: Part 2

July 26, 2017, 9:00 am

≫ Next: Creating a true voice call experience with OpenTok and CallKit

≪ Previous: How to use Android Picture-in-Picture mode with OpenTok

Part 2 – Creating the best possible user experience for social video apps

In Part 1, we looked at some of the key considerations for building a group live video app for mobile along the lines of Houseparty and Facebook Bonfire, and how the OpenTok platform can provide the solutions to some of the hurdles caused by using WebRTC off-the-shelf. In Part 2, we’ll look at some specific features and code which can be used to create an awesome user experience so your users will fall in love with the app.

Give your users the best possible video experience

Further features to enhance the social experience

Besides the benefits of using OpenTok in the client code over raw WebRTC, there are other benefits in the server side that will make your solution look better to the end user. In Part 1, we talked about the network savings of having the OpenTok server as the distributor of all the media, but having a server in the middle enables it to add things like session recording and archiving, media adaptations and better congestion detection algorithms. It also allows for intelligent quality controls and manageability tools which make it ideal for building a business.

One of the improvements that our servers brings to user experience is simulcast. Simulcast allows a client to send several streams with different qualities, and allows a receiver to receive the best quality based on his network speed or CPU constraints.

opentok simulcast mobile group live video

If you don’t use simulcast, all clients will receive the same quality stream, and that quality will be determined by the most restricted device in the session.

As social video apps are used in a wide range of settings, this feature means that you are constantly providing the best possible stream quality to all participants, whether they are in a coffee shop, their bedroom or out and about.

Must-dos when using OpenTok for social apps

We have seen a lot of technical concepts and benefits that using OpenTok will bring over using other technologies like raw WebRTC. Many of them are very important if you plan to build a social app with media communication.

We have already seen the benefits of using Simulcast to deliver as high quality a stream as possible to callers. Enabling simulcast is something that is done at server level – you don’t need to change any client code to enable it, however there are several things that you should do at client level.

OpenTok SDK’s Subscriber class has a couple of methods to set the preferred resolution and the preferred framerate. When using Simulcast, calling this will give insights about which stream a particular client wants to receive. For example, if your apps renders the peers video view in a layout that gives just 200 by 200 pixels, it is wise to tell the server to send you the stream with dimensions closest to that. If you receive a 1280 by 720 pixel stream you will be wasting resources since you will render it on a smaller view size. In order to do that you can do something like this:

// Swift code
// OTSessionDelegate
func session(_ session: OTSession, streamCreated stream: OTStream) {
  let subscriber = OTSubscriber(stream: stream, delegate: self)
  subscriber.view.frame = CGRect(x: 0, y: 0, width: 200, height: 200)
  subscriber.preferredResolution = CGSize(width: 200, height: 200)
  subscriber.preferredFrameRate = 10
}

You can change the resolution and the framerate as many times you need it, and the server will send you the stream more appropriate for each change.

In order to save resources it is advisable that you set both properties every time the subscriber view size changes. If, for example, your app allows some way to feature a single subscriber, you can increase the resolution when featuring the subscriber.

Another important thing to take into account is to disable the subscriber video if a particular subscriber goes out of the screen. Your app might not want to display all subscribers at the same time in the screen. If that happens, you should make sure that you are calling subscriber.subscribeToVideo = false for that subscriber. Setting the value to false saves the CPU from doing some work since the video will not be processed.

Creating your User Interface for Group Live Video with OpenTok

One of the key elements of building a successful mobile application is to offer an attractive user interface that will engage the users.

When using the OpenTok SDK, you will have every video stream offered like a regular view so you can do whatever you do with the rest of them in the platform. In the case of iOS you will have a View element, so you can apply UIAnimation animations or attach it to any components which lays out views.

One of the best choices is to use an UICollectionView where you can set up your views in many ways since it supports specifying a custom layout for them. If you want to see it in action, a good choice is to take a look at our sample app. This sample is using a custom layout to display all the views of the participants of a session.

In the case of Android, the picture is more or less similar but there are two options when rendering the view. We use OpenGL to render the video streams, and Android offers two ways of drawing it, using GLSurfaceView and using TextureViews. Usually TextureView is more flexible in terms of view layouts, since it behaves like a regular View, however GLSurfaceView is slightly more efficient in terms of rendering. When using OpenTok, you can specify when creating a Session which of the two view backends you want to use, as in the following piece of code:

Session s = new Session(MainActivity.this, APIKEY, SESSION_ID, new Session.SessionOptions() {
   @Override
   public boolean useTextureViews() {
       return true;
   }
});

For displaying the Views, in Android the best option is to use the brand new ConstraintsLayout where you can build a rich and responsible interface.

multiparty constraint layout group live video Group live video multiparty layout

If you want to have a taste of using ConstraintLayout with OpenTok, you can take a look at our sample app here.

Many of the fastest growing social apps in the world use the OpenTok platform for group live video, such as Houseparty, live.ly and Monkey – apps with a truly global reach. If you’re interested in joining their ranks and building your own social app, keep an eye out for our next post, which will teach you how to use CallKit to send notifications to a phone.

And if you can’t wait that long for the next instalment, check out this post which will teach you how to add Snapchat-like filters to your web application, and have a look at our tutorials in the Developer center which will have you up and running in no time. We love social!

The post Hacking Social Video: Building a Group Live Video App: Part 2 appeared first on TokBox Blog.

↧

Creating a true voice call experience with OpenTok and CallKit

July 31, 2017, 10:00 am

≫ Next: How we test WebRTC live video sessions for massive audiences

≪ Previous: Hacking Social Video: Building a Group Live Video App: Part 2

In previous posts, we have looked at some key considerations for building apps for social video calling on mobile devices, and some of the features you can include to make sure your users have a great experience.

Here, we’re going to look in depth at CallKit, a framework for iOS which is an important component for creating frictionless, delightful apps, especially voice and video calling apps.

Introducing CallKit and why it’s important

Last year at WWDC 2016, Apple announced a new VoIP framework: CallKit. This new framework allows developers to integrate VoIP calling services with other call-related apps on iOS. Simply put, your own application calling services can now have the same system priority as native phone calls.

callkit opentok webrtc voice and video

Before CallKit, a app VoIP call could easily be dropped by an incoming native phone call without warning. Now users have the control to either continue or stop the call through the native calling user interface.

TokBox has been powering real-time audio and video communication over the web for a decade, and we are glad to have this new functionality to improve the calling experience for native iOS apps. In the real-time communication business, quality and continuity are often critical criteria, and even small inconveniences can lead to user frustration. CallKit provides a much-needed improvement to continuity, and will undoubtedly help developers (including those building TokBox apps) create more enjoyable user experiences as the mobile market continues to advance.

CallKit

The sample app below will show you how to integrate CallKit with OpenTok iOS SDK. With CallKit, your app will now be able to:

Use the native incoming call UI in both locked and unlocked state.
Interact with other calls in the system.

As a developer, you will mostly use two primary classes: CXProvider and CXCallController.

CXProvider

A CXProvider object is responsible for reporting out-of-band notifications that occur to the system. Whenever such an event occurs, CXProvider internally creates a CXCallUpdate object to notify the system. A CXCallUpdate encapsulates new or updated call-related information which exposes properties such as the caller’s id, or whether it’s an audio-only, or a video call etc. The app communicates with CXProvider through the core protocol: CXProviderDelegate, which defines methods for provider lifecycle events and telephony actions. The templates below are shown in Swift where the different types of the action parameter distinguish the different func methods.

// MARK: CXProviderDelegate
func providerDidReset(_ provider: CXProvider) {
    print("Provider did reset")
}

func provider(_ provider: CXProvider, perform action: CXStartCallAction) {
    print("Provider performs the start call action")
    // Configure audio session but do NOT start audio until session activated
}

func provider(_ provider: CXProvider, perform action: CXAnswerCallAction) {
    print("Provider performs the answer call action")
   // Configure audio session but do NOT start audio until session activated
}

Note the importance of getting the timing right with the above actions. We can configure the audio session and other information, but we should not start call audio until the audio session has been activated by the system after having its priority elevated – see AVAudioSession below.

func provider(_ provider: CXProvider, perform action: CXEndCallAction) {
    print("Provider performs the end call action")
    // Trigger the call to be ended via the underlying network service.
}

func provider(_ provider: CXProvider, perform action: CXSetHeldCallAction) {
    print("Provider performs the hold call action")
}

func provider(_ provider: CXProvider, perform action: CXSetMutedCallAction) {
    print("Provider performs the mute call action")
}

The following methods indicate whether your app’s call has successfully had its priority boosted or recovered:

func provider(_ provider: CXProvider, timedOutPerforming action: CXAction) {
    print("Timed out \(#function)")
    // React to the action timeout if necessary, such as showing an error UI.
}

func provider(_ provider: CXProvider, didActivate audioSession: AVAudioSession) {
    // Start call audio media, now that the audio session has been activated
    // after having its priority boosted.
}

func provider(_ provider: CXProvider, didDeactivate audioSession: AVAudioSession) {
    // Restart any non-call related audio now that the app's audio session has been
    // de-activated after having its priority restored to normal.
}

CXCallController

Let’s explore how to make a call and answer a call on behalf of a user. To do that, we need a CXCallController object to interact with the system.

The CXCallController object takes a CXTransaction object to request a telephony action (which will later trigger delegate methods above if successful). To specify a telephony action in a transaction, you need to create your desired action object and associate them with the transaction. Each telephony action has a corresponding CXAction class such as CXEndCallAction for ending a call, or CXSetHeldCallAction for putting a call on hold.

Once you have it all ready, invoke the request(_:completion:) by passing a ready transaction object. Here’s how you start a call:

callkit opentok voice call

// create a CXAction
let startCallAction = CXStartCallAction(call: UUID(),
  handle: CXHandle(type: .phoneNumber, value: handle))
// create a transaction
let transaction = CXTransaction()
transaction.addAction(startCallAction)

// create a label
let action = "startCall"

callController.request(transaction) { error in
    if let error = error {
        print("Error requesting transaction: \(error)")
    } else {
        print("Requested transaction \(action) successfully")
    }
}

Present Native Incoming Call Screen

callkit opentok webrtc

// Construct a CXCallUpdate describing the incoming
// call, including the caller.
let update = CXCallUpdate()

update.remoteHandle =
  CXHandle(type: .phoneNumber, value: handle)


// Report the incoming call to the system
provider.reportNewIncomingCall(with: uuid, update: update)
{ error in
    // Only add incoming call to the app's list of calls if
    // call was allowed (i.e. there was no error) since calls
    // may be "denied" for various legitimate reasons.
    // See CXErrorCodeIncomingCallError.
}

For testing purpose, you can define your own customer behavior to present the native calling screen. Often, this piece of code works with a VoIP remote notification to make a call to a specific device/person, as WhatsApp, WeChat and Messenger do.

Voice-only configuration with OpenTok

By default, OpenTok iOS SDK requires camera and microphone for video and audio communication. For voice-only apps, you might not want to acquire camera access. You can easily achieve this by setting the videoTrack property to false :

if publisher == nil {
    let settings = OTPublisherSettings()
    settings.name = UIDevice.current.name
    settings.audioTrack = true
    settings.videoTrack = false
    publisher = OTPublisher.init(delegate: self, settings: settings)
}
var error: OTError?
session?.publish(publisher!, error: &error)
if error != nil {
    print(error!)
}

Of course, you don’t have to receive the video stream either:

subscriber = OTSubscriber.init(stream: stream, delegate: self)
subscriber?.subscribeToVideo = false
if let subscriber = subscriber {
    var error: OTError?
    session.subscribe(subscriber, error: &error)
    if error != nil {
        print(error!)
    }
}

Download the Sample App

Based on the Apple CallKit sample app Speakbox, we have developed a sample app to demonstrate how to integrate CallKit with OpenTok to create a true video call experience. With the sample app, you can try out how to actually make an outbound call and simulate an incoming call in both locked and unlocked state with OpenTok.

Unfortunately, a small issue prevents the sample from working fully. It turns out that the audio session does not function normally if a call is accepted from the locked state. Thanks to the brilliant folk over at Apple, they suggest developers set up the audio session as early as possible to temporarily make the case work. Here is the information: https://forums.developer.apple.com/thread/64544.

callkit demo voice call app

Conclusion and Next Steps

Isn’t it great to make your app behave like a native phone call and redirect your app when possible? Now you can do it with CallKit!

While CallKit won’t fundamentally change your iOS app, the improved continuity will undoubtedly help you provide a more seamless user experience for your users.

Note: The application above requires values for TokBox API key, session ID, and token. During development and testing, you can get these values from the project page in your TokBox Account. For production deployment, you need to generate the session ID and token values using one of the OpenTok Server SDKs.

If you don’t have a TokBox account, sign up for your free trial here.

For an overview of the benefits of using the OpenTok platform for building richer applications, check out our series on social video apps:

Infographic on the World of Social Video Chat
Building a Group Live Video App Part 1
Building a Group Live Video App Part 2
How to add Snapchat-like filters to a Web App

Have fun!

The post Creating a true voice call experience with OpenTok and CallKit appeared first on TokBox Blog.

↧

How we test WebRTC live video sessions for massive audiences

September 13, 2017, 9:00 am

≫ Next: Welcome Windows Developers!

≪ Previous: Creating a true voice call experience with OpenTok and CallKit

Co-authored by Tiffany Walsh, Patrick Quinn-Graham, and Michael Sander.

We recently announced the launch of our large-scale Interactive broadcast capabilities, including the option to publish to a wide variety of endpoints via RTMP and HTTP Live streaming. You can now broadcast to an audience of up to 3000 real-time viewers – and we know because we’ve been testing it to get it perfect, so that you don’t have to. Want to know how we test WebRTC sessions for huge audiences even though we only have a handful of people on our team? Read on!

WebRTC session types

Testing interactive audio and/or video live sessions, with a small to medium number of participants is relatively easy, since a publisher (media source, i.e. microphone, webcam, etc.) or a subscriber (media sink, i.e. web browser, native app with widget to visualize video, headphones, etc.) takes an acceptable portion of the resources in a device (laptop, mobile, tablet…), and we can have several of them at the same time.

Here’s a quick set of examples of some of the types of sessions we can create with OpenTok:

1-to-1: Two people, each one sending audio and/or video, and receiving the other’s.

Many-to-Many: Several people, each one sending audio and/or video, and receiving all audio and video streams of the other participants.

One-way Broadcasting: The best examples of this type are TV and radio broadcasts, for instance live sports or news. There is a very limited number of media sources (i.e. one in the arena, another on the TV set, and a reporter on the street), with a potentially huge audience (in the order of hundreds of millions), like you would get for a Super Bowl match. With TokBox, these sessions can be created using HLS or RTMP streams. You can find more information about these types of sessions in our blog posts about Crowdcast and Future of webinars.

Interactive broadcast: There are a few publishers and a medium to large real-time audience, up to a total of 3000 streams in a session. At certain points, an audience member can go “on stage” and publish, so all others can see and hear them. A talk show format where there is an interview with a celebrity and people at home can go live for a few seconds and ask a question is a real use case, for instance, the MLB Chatting Cage.

How we test WebRTC session types

As we’ve already talked about, creating test sessions of types 1 (1-to-1 ) and 2 (many-to-many) in a controlled test environment is fairly simple, and it only requires a minimal amount of resources that remains low, with just a few laptops, mobiles, tablets or other devices needed.

Moving to type 3, one-way broadcast, doesn’t actually impose further constraints, since we only need a few publishers and, optionally, one or more subscribers. Once the HLS or RTMP stream is created, all we need to do is connect it to a very small number of clients (a tab in the web browser, or any other application). Quality in this stream is provided by the underlying CDN, so there will be no difference in creating 1 client or 1000.

All these tests don’t look very exciting, do they? The actual challenge arises when we come up with the need to test WebRTC for medium to large and massive interactive broadcast sessions, where more than 100 (and up to several thousands) subscribers are connected to the same one. It is obvious that the approach of having a number of laptops and mobile devices on a table, and connecting them manually to the session is not enough. Even having this automated in some way falls short, since the huge number of clients requires a large amount of hardware. We cannot have all the clients in the same machine, nor can we visually check the quality for those publishers and subscribers.

Dropbear to the rescue

Here is where our Dropbear tool comes to the rescue. It is a tool developed internally at TokBox, designed to be able to deal with an arbitrary number of sessions, each of which with an arbitrary number of publishers and subscribers that simulate real clients in mobile or fixed devices. Traffic generated by them is realistic in both protocol (signalling, RTP/RTCP) and media aspects (VP8 video and Opus audio), which allows us to extract valid conclusions regarding platform quality and performance.

The general architecture used in Dropbear can be seen in the following diagram:

test webrtc dropbear framework for large session

There is a master node where the user connects via ssh. There, we configure some important environment variables, such as the OpenTok API URL, the Amazon AMI that will be run in the workers, or the API key and secret to be used to create sessions. Then, we run the tool, complementing configuration with the number and types of sessions, number of publishers per session, subscribers, etc.

The master node then calculates the number of CPU cores required to run the test, based on the aggregated number of publishers and subscribers for all concurrent sessions. These publishers and subscribers are implemented by workers, with a relation of one worker per publisher/subscriber. Workers are spread amongst EC2 boxes, with several workers per core. This means that, if, for example, we can have 2 workers per core, in an 8-core machine, running a test with 1 publisher and 2000 subscribers would require:

NUM_BOXES = CEIL((NPUBLISHERS + NSUBSCRIBERS) / WORKERS PER BOX) = CEIL((1 + 2000) / 16) = CEIL(125.06) = 126

A total of 126 boxes will be needed to run this specific test.

In order to implement this stage, we use EC2 instances, which are spun up and down on demand, using Spot Fleet, requesting 1001 cores (made up of whatever instance types are available) and that are shut down automatically at the end of the tests, when they are not required anymore, or after an activity timeout, whichever comes first (to prevent leaving running instances without actually being used).

Dropbear uses more CPU than memory, so we opt for the “compute optimized” EC2 instance types – c4.2xlarge up to the c4.8xlarge. The latter type allows a much higher number of workers per box, but at a higher cost. Spot Fleet is used because it prefers the cheapest (per core) instance type in the availability zone with the cheapest spot price.

Regarding workers, they are implemented in javascript, and run inside a docker container. This gives us a reliable and reproducible environment – that is described in code. These containers instantiate an instance of a custom built CentOS 7 image, which contains specific versions of Node.js and a node module that wraps our native SDK (the same core that powers our iOS, Android and Windows SDKs).

While the test runs, logs generated by the workers are sent to an ElasticSearch server, where they can be consumed in real time (or after the test) through Kibana.

When the test finishes, all workers notify the master node, and a report is then generated for each session created. This report provides a lot of useful information, such as average bitrates, packet loss and connection times.

Since we’ve done the testing for you, you can get on with developing your interactive broadcast application and make plans for spreading the word to as wide an audience as possible. To get started with the OpenTok platform, sign up for an account including $10 credit here.

Our Developer Center is full of resources to help you get up and running, and you can get in touch if you have any questions.

The post How we test WebRTC live video sessions for massive audiences appeared first on TokBox Blog.

↧

Welcome Windows Developers!

September 21, 2017, 4:13 am

≫ Next: Introducing our brand new Developer Evangelist!

≪ Previous: How we test WebRTC live video sessions for massive audiences

We’re happy to say that last week, as part of the 2.12 release of our client SDKs, we graduated the OpenTok Windows Client SDK out of beta. This SDK allows Windows developers to add rich, immersive video chat to Windows native applications and to really let their creativity flow.

Yes, the PC lives on

The technology industry is always focused on the next big platform. The buzz right now is centered on virtual reality and augmented reality. In addition, there is continued interest in IoT and given the ubiquity of smart phones, the mobile platform is very much top of mind for developers. However, there were 269M PCs shipped globally in 2016 and over 1.4B devices are estimated to be running Windows 7 or higher worldwide. The majority of knowledge workers today continue to use PCs as their primary device for their day-to-day work.

What can the Windows SDK do for you?

The OpenTok Windows SDK exposes a C# API and allows developers to create collaboration experiences that leverage the performance and rich UI elements of the operating system in a way that is not possible with browser based web applications. In addition, access to the frame buffer of the display makes screensharing with remote participants simple. The API also allows for custom video capturing and rendering implementations giving developers the flexibility to develop truly creative solutions.

What’s next?

Here at TokBox, our goal is to serve developers regardless of what platform they are developing on. In addition to our OpenTok Windows Client SDK, we are developing an SDK for macOS. For an early version contact us at sdk-beta@tokbox.com.

We are delighted to make our Windows Client SDK available to Windows developers and we are eager to see the creative applications you come up with. Ready to get started? Sign up for a trial account with $10 free credit here.

The post Welcome Windows Developers! appeared first on TokBox Blog.

↧

Introducing our brand new Developer Evangelist!

October 2, 2017, 10:13 am

≫ Next: Build WebRTC desktop apps with OpenTok and Electron

≪ Previous: Welcome Windows Developers!

Hey everyone, my name is Manik and I’m happy to say I’ve recently joined TokBox as the Developer Evangelist.

I’m extremely excited to be a part of a company that’s creating an ecosystem for real-time communication. OpenTok allows users to take full control of implementing video and audio chat, as well as messaging, into their Web, iOS, Android, and Windows products. It’s awesome to see how developers all over the world are using the API and I’m looking forward to seeing many more creative ideas in the future.

In my journey as the Developer Evangelist, I want to be a liaison between the developer community and TokBox in an effort to constantly improve the developer journey and OpenTok as a whole. To do so, I’ll be focusing on the following areas:

Meeting with community members to gain insightful feedback and expand TokBox’s global footprint
Hosting a series of events to help developers use and make the most of the OpenTok API
Creating videos and tutorials outlining the best practices for OpenTok API and tools
Working with community members to maintain and add support for third party frameworks

opentok developer evangelist webrtc As you continue to use OpenTok to add real-time communication to your apps, I look forward to working with you and bringing your ideas to life. Please feel free to reach out to me on LinkedIn or Twitter and share what you are working on!

And if you aren’t working on a project yet, what are you waiting for? Check out our Developer center and get started with our Video Chat Embed to see for yourself just how easy it is to get started with real-time communications.

The post Introducing our brand new Developer Evangelist! appeared first on TokBox Blog.

↧

Build WebRTC desktop apps with OpenTok and Electron

October 4, 2017, 5:44 am

≫ Next: How to add a video chat embed to your Wix website

≪ Previous: Introducing our brand new Developer Evangelist!

The Electron Framework

Who is Electron For?

OpenTok and Electron

To start building the OpenTok Electron sample application, clone the OpenTok Web Samples Github repository.

git clone git@github.com:opentok/opentok-web-samples.git

You’ll notice that the OpenTok sample code for adding a video chat is almost identical to the web sample app code, but with some slight changes such as adding Electron to the application.

Now, install both the OpenTok and Electron libraries by running `npm install` in your terminal.

// Set Credentials
const apiKey = "";
const sessionId = "";
const token = "";

To see the app in action, start the application by running `npm start`

Electron for OpenTok sample app

What’s Next?

If you’re interested in implementing your own messaging and social apps like Slack and Snapchat, check out the following posts:

The post Build WebRTC desktop apps with OpenTok and Electron appeared first on TokBox Blog.

↧

How to add a video chat embed to your Wix website

October 12, 2017, 11:11 am

≫ Next: Build a live video app with ARKit and OpenTok

≪ Previous: Build WebRTC desktop apps with OpenTok and Electron

Wix
Squarespace
Zoho Sites
Weebly (Business and above)
WordPress hosted account (paid)

Who are Video Chat Embeds for?

If you’re an educator, doctor, support agent, blogger or someone who wants to add a 1:1 video chatting to your website then embeds are for you!

Using Video Chat Embeds

You can access the video embed for our music Wix site here: https://manik669.wixsite.com/musiclessons/live-lessons

Adding Code to Wix

Log into Wix.
Click “Manage Site”.
Click “Edit Site”.
Select the page where you want to add the video chat
On the widget on your left, click on the “+(Add)” icon.
At the bottom, click “More”.
Under “HTML & Flash”, click “HTML Code”.
- This will create a small widget on your site editor.
- Place this widget where you’d like your video chat to be on your page
Click “Edit Code” button on the widget and add select the “Code” option.
Add your OpenTok Video Chat Embed code in the text box below and click “Update”.
After adding your code, click “Publish” on the top right corner to deploy your changes live.

An example of the video chat embed and the HTML Settings in edit mode will look like this:

Add video chat to your Wix website

Live music lesson with video chat embeds

What’s Next?

The post How to add a video chat embed to your Wix website appeared first on TokBox Blog.

↧

Build a live video app with ARKit and OpenTok

October 17, 2017, 2:41 pm

≫ Next: How to use Android Picture-in-Picture mode with OpenTok

≪ Previous: How to add a video chat embed to your Wix website

An intro to AR

OpenTok and ARKit

In this blog post we will describe how you can use ARKit to show the video of your session participants.

All the code of this blog comes from a ARKit sample that you can find here.

ARKit primer

Creating Virtual Elements

ARSession and ARSessionConfiguration

Source: https://developer.apple.com/documentation/arkit/building_your_first_ar_experience

OpenTok and SceneKit

OpenTok and SceneKit for ARKit app

Metal Rendering in OpenTok

In some ways, Metal design is close to OpenGL and the concepts we used to build our OpenGL renderer are valid for building the Metal one. We just need to reformulate how we render the video.

Guess why? SCNKit objects use SCNMaterial instances to give real apparency to SCNKit objects, and yes, SCNMaterial are formed by MTLTexture.

So we have our path clear, we will need to:

Create a custom OpenTok renderer that will receive video frames,
Those video frames will be fed to the metal compute shader
The shader will convert them to a RGB in a MTLTexture
That texture will be assigned to a SCNPlane in our scene.

Easy, right?

Metal rendering in OpenTok for ARKit app

YUV to RGB Metal compute shader

Once we have a clear view of what we want to achieve, let’s see the code we used to make it real. We will start by showing the metal shader:

kernel void YUVColorConversion(
       texture2d<uint, access::read> yTexture [[texture(0)]],
       texture2d<uint, access::read> uTexture [[texture(1)]],
       texture2d<uint, access::read> vTexture [[texture(2)]],
       texture2d<float, access::write> outTexture [[texture(3)]],
       uint2 gid [[thread_position_in_grid]])
{
    float3 colorOffset = float3(-(16.0/255.0), -0.5, -0.5);
    float3x3 colorMatrix = float3x3(
                                float3(1.164,  1.164, 1.164),
                                float3(0.000, -0.392, 2.017),
                                float3(1.596, -0.813, 0.000)
                            );

    uint2 uvCoords = uint2(gid.x / 2, gid.y / 2); // Due to UV subsampling
    float y = yTexture.read(gid).r / 255.0;
    float u = uTexture.read(uvCoords).r / 255.0;
    float v = vTexture.read(uvCoords).r / 255.0;
    float3 yuv = float3(y, u, v);
    float3 rgb = colorMatrix * (yuv + colorOffset);
    outTexture.write(float4(float3(rgb), 1.0), gid);
}

Custom Metal renderer

textureDesc = MTLTextureDescriptor.texture2DDescriptor(
                 pixelFormat: .rgba16Float,
                 width: Int(format.imageWidth),
                 height: Int(format.imageHeight),
                 mipmapped: false) 
textureDesc?.usage = [.shaderWrite, .shaderRead] 
 // device is MTLDevice instance
let outTexture = device.makeTexture(descriptor: textureDesc!)

The difference is the pixelFormat, RGB output will be rgbafloat, and Y input will be r8Uint since the Y place just has 1 byte per pixel.

In order to fill the yTexture with the data coming from the OTVideoFrame, we will do:

guard let planes = frame.planes else { return }
yTexture!.replace(
                   region: MTLRegionMake2D(0, 0,
                                         Int(format.imageWidth), Int(format.imageHeight)),
                   mipmapLevel: 0,
                   withBytes: planes.pointer(at: 0)!,
                   bytesPerRow: (format.bytesPerRow.object(at: 0) as! Int))

Although it could sound a little bit intimidating, as you can see, the code is not that complex:

let defaultLibrary = device.makeDefaultLibrary()
let commandQueue = device.makeCommandQueue()
let commandBuffer = commandQueue?.makeCommandBuffer()
let commandEncoder = commandBuffer?.makeComputeCommandEncoder()
let kernelFunction = defaultLibrary?.makeFunction(name:"YUVColorConversion")
let pipelineState =
                         try! device.makeComputePipelineState(function: kernelFunction!)

commandEncoder?.setComputePipelineState(pipelineState) 
commandEncoder?.setTexture(yTexture, index: 0)
commandEncoder?.setTexture(uTexture, index: 1)
commandEncoder?.setTexture(vTexture, index: 2)
commandEncoder?.setTexture(outTexture, index: 3)

commandEncoder?.dispatchThreadgroups(
                        threadgroupsPerGrid,
                        threadsPerThreadgroup: threadsPerThreadgroup)
commandEncoder?.endEncoding() 

commandBuffer?.commit()

Once we call commit, the shader will do its job, and will start processing our textures.

Linking everything together

If you remember our checklist:

Create a custom OpenTok renderer that will receive video frames,
Those video frames will be fed to the metal compute shader
The shader will convert them to a RGB in a MTLTexture
That texture will be assigned to a SCNPlane in our scene.

There is just one thing left. Assign the output of the Metal shader to the SCNMaterial of our SCNPlace in the scene.

To do that, we need to get the reference of the plane, and we will do that in our UIViewController,

let scene = SCNScene(named: "art.scnassets/opentok.scn")!
let node = scene.rootNode.childNode(withName: "plane", recursively: false)!

Once we have the node, we will send that node to the Custom Renderer we have built, and it will assign the texture to its material, by executing:

node.geometry?.firstMaterial?.diffuse.contents = outTexture
// outTexture from our custom capturer.

override func viewWillAppear(_ animated: Bool) {
  super.viewWillAppear(animated)
  let configuration = ARWorldTrackingConfiguration()
  sceneView.session.run(configuration)
}

    
override func viewWillDisappear(_ animated: Bool) {
  super.viewWillDisappear(animated)
  sceneView.session.pause()
}

The post Build a live video app with ARKit and OpenTok appeared first on TokBox Blog.

↧

How to use Android Picture-in-Picture mode with OpenTok

October 30, 2017, 6:14 am

≫ Next: How Calgary Scientific Drives Real-Time Collaboration with the Windows SDK

≪ Previous: Build a live video app with ARKit and OpenTok

In this blog post we will see how you can add this feature to your application using the OpenTok SDK.

Preparing your Application to support Android Picture-in-Picture

In order to have PiP, you can use the same SDK you are already using – no changes are required from OpenTok side.

When in PiP mode, your activity will be shown on top of everything else, and in a small window, so you should plan to remove all UI elements and display just the video of the OpenTok session.

You don’t need to put anything to exit from PiP and back to fullscreen mode. Android will deal with that, adding some controls that will appear when the user touches the PiP Window.

You need to add three things in your activity entry in the manifest file:

  android:resizeableActivity="true"

  android:supportsPictureInPicture="true"

  android:configChanges=

        "screenSize|smallestScreenSize|screenLayout|orientation"

Besides the supportsPictureInPicture, it is important to add the resizableActivity attribute and the configChanges, so your activity won’t be recreated every time it enters or exits PiP mode.

Entering Picture-In-Picture mode

PictureInPictureParams params = new PictureInPictureParams.Builder()

       .setAspectRatio(new Rational(9,16))

       .build();

Using this class you can also tweak the controls that will be shown while in PiP. See the documentation for more details

To be in PiP or not to be

Android Picture-in-Picture with OpenTok live video call

For example this is the code we use in the sample code that illustrates this blogpost:

@Override
public void onPictureInPictureModeChanged(boolean isInPictureInPictureMode,
Configuration newConfig) {
   super.onPictureInPictureModeChanged(isInPictureInPictureMode, newConfig);
   if (isInPictureInPictureMode) {
       findViewById(R.id.button).setVisibility(View.GONE);
       mPublisherViewContainer.setVisibility(View.GONE);
       publisher.getView().setVisibility(View.GONE);
       getActionBar().hide();
   } else {
       findViewById(R.id.button).setVisibility(View.VISIBLE);
       mPublisherViewContainer.setVisibility(View.VISIBLE);
       publisher.getView().setVisibility(View.VISIBLE);
       if (publisher.getView() instanceof GLSurfaceView) {
           ((GLSurfaceView)publisher.getView()).setZOrderOnTop(true);
       }
       getActionBar().show();
   }
}

See that we hide the action bar (be careful about using AppCompatActivity and getSupportActionBar) and the publisher container.

Activity has a handy isInPictureInPictureMode method that will tell you if the activity is in that mode or not, so your new onPause/onResume method will now look something like this:

@Override

protected void onPause() {
   super.onPause();
   if (!isInPictureInPictureMode()) {
       if (session != null) {
           session.onPause();
       }
   }
}
@Override
protected void onResume() {
   super.onResume();
   if (!isInPictureInPictureMode()) {
       if (session != null) {
           session.onResume();
       }
   }
}

Time to get building

At this point, you should be able to add this cool feature to your app. Android handles most of the process and requires few changes from your side so this is a win/win situation.

We have built a simple sample code that illustrates the process of adding Picture-in-Picture to an OpenTok-based Android application.

The post How to use Android Picture-in-Picture mode with OpenTok appeared first on TokBox Blog.

↧

How Calgary Scientific Drives Real-Time Collaboration with the Windows SDK

November 15, 2017, 9:54 am

≫ Next: The Cordova OpenTok Plugin: Live video for iOS and Android applications

≪ Previous: How to use Android Picture-in-Picture mode with OpenTok

As part of our 2.12 client release, we were delighted to graduate our OpenTok Windows Client SDK out of beta and make it generally available. In addition to being a standard client endpoint for the OpenTok platform, the Windows SDK makes it easier to go beyond regular video chat and create whole new level of collaboration experiences.

Creativity & magic

The creativity of developers never ceases to amaze me and when you provide developers with powerful tools like our Windows Client SDK, magical things happen. A few weeks ago, I was fortunate to speak to the team at Calgary Scientific. Calgary has two major divisions: one in health care called ResolutionMD, and another focusing on cloud collaboration and mobility called PureWeb. Specifically, I spoke to them about their innovative collaboration plugin for the Rhino3D CAD tool, called PureWeb Cast, which has launched in beta today.

The PureWeb Cast plugin allows a designer working on a 3D model to share the rendered model with anyone in the world and collaborate in real-time. The remote participants simply click on a link in their browser and immediately they can see the rendered model in their browser, no downloads or installs needed. They can take control, rotate, zoom and annotate as necessary. Adding audio to the experience is straightforward and the result is a powerful and highly targeted collaboration tool.

Under the hood of the Windows SDK

So how is this all working? Since the OpenTok Windows SDK is a native SDK it is very easy to put together screencasting functionality, see the sample code here. Of course the framebuffer you are “casting” does not need to be the screen of your PC, it can be any framebuffer and therein lies the magic. It can be a 3D model like the Rhino3D example, architectural drawings, or it could be a series of graphical overlays. Under the hood, this framebuffer is encoded into a compressed video stream and sent to the OpenTok cloud. From there it can be sent to any OpenTok client (browser, mobile or desktop), making the collaboration seamless.

Key Benefits of Windows SDK

In addition to making screencasting easy, the OpenTok Windows Client SDK has the following benefits:

Tighter integration with the operating system and better performance than web apps
Superior security of a native application
Easy to deploy in enterprise IT environments

For technical specifications, please review our Windows SDK datasheet.

Want more information?

The OpenTok Windows Client SDK API documentation is available from our developer center here and there is a quick tutorial to get started with basic two-party calls here.

The OpenTok Windows Client SDK allows you to build collaboration applications that are tightly integrated with the underlying Windows operating system and deliver compelling collaboration experiences that streamline productivity. We look forward to seeing what you build for Windows using this SDK.

The post How Calgary Scientific Drives Real-Time Collaboration with the Windows SDK appeared first on TokBox Blog.

↧

The Cordova OpenTok Plugin: Live video for iOS and Android applications

November 29, 2017, 11:26 am

≫ Next: Game on! Learn how to add OpenTok live video chat to Unity

≪ Previous: How Calgary Scientific Drives Real-Time Collaboration with the Windows SDK

When I introduced myself at the start of October, I shared my aim of working with TokBox developer community members to maintain and add support for third-party frameworks in order to reach as many developers as possible. I’ve been busy since then speaking to our community and learning about what they need to have a great experience with our OpenTok platform and get creative with live video.

One framework that has come up repeatedly is Cordova. So today, I’m really happy to announce that I’ve updated the Cordova OpenTok Plugin so it’s compatible with the OpenTok iOS and Android 2.12.0 SDKs.

By using the plugin and the Cordova framework, you can add live video chatting capabilities to your mobile applications in JavaScript without having to touch any native code. Please keep in mind that the plugin does not yet support all of the features that are available in the native SDKs. For more details, you can find the documentation regarding the plugin here.

For supporting material, we’ve also created an OpenTok Cordova Samples repository which contains sample code for creating the following applications:

Below, you can see me video chatting with Ernest, one of our cloud engineers, using the OpenTok Cordova Plugin.

Cordova OpenTok plugin

We’ve categorized both the plugin and samples repos as OpenTok Labs projects. The OpenTok Labs label indicates that the plugin is an exploration project, rather than an officially supported repo, so we’d love you to contribute by filing issues and sending pull requests to improve it. For more information on contributing, please visit the repo contribution guidelines.

As you may know, you can also use Cordova plugins for Ionic applications through Ionic Native, a TypeScript wrapper for Cordova Plugins. Knowing that, please keep an eye out for more sample applications on how you can use OpenTok and Ionic together!

The post The Cordova OpenTok Plugin: Live video for iOS and Android applications appeared first on TokBox Blog.

↧

Game on! Learn how to add OpenTok live video chat to Unity

December 13, 2017, 8:24 am

≫ Next: Accelerate App Development with Fast Tracks

≪ Previous: The Cordova OpenTok Plugin: Live video for iOS and Android applications

When talking about game development, there is one name that quickly comes to mind. Unity has become one of the most popular engines that you can use if you plan to develop a game. Its multiplatform capabilities and ease of use makes it a good solution to bring your idea to life.

Like any other type of application, adding live communication features to a game is not a trivial thing. There are plenty of complicated problems to solve. OpenTok comes to the rescue in most scenarios and adding video chat to Unity game development is no exception.

In this blog post we will describe how you can integrate the OpenTok SDK into a Unity-powered game targeted to the Windows platform, using the recently released Opentok Windows SDK.

Along with the blog post, we have published a working sample that implements what we describe in this article. Please refer to it for more details.

Adding live video chat to Unity – the big picture

Seen from a distance, the task of displaying live video in a Unity element looks pretty simple. OpenTok SDK can provide you with a continuous stream of video frames from the session participants. Once you have a video frame, you just need to display it in a Unity element.

The first question could be, “How can I display real-time video in a Unity element?” Unity has support for displaying video clips, but live video is somehow different. We’ll use Unity Textures to draw the frames of the video stream and will assign that texture to a Unity element. More precisely, we’ll use an instance of Texture2D class.

If we want to display around 20 frames per second we need to be able to draw the video frames onto the texture as fast as we can, so we’ll use a low-level Unity native plugin to access the texture using the rendering APIs available in each system. In other words, we’ll use DirectX (9 or 11), or OpenGL to dump the video frames coming from the OpenTok SDK to the Unity texture.

Summarizing, in order to receive frames, we will build a custom OpenTok renderer. Once we have a frame, we will feed the low-level plugin with that frame. In its own lifecycle, the low-level plugin will receive a notification from the Unity side whenever the texture needs to be redrawn (usually at 30fps). It will get the last frame set by the custom renderer, and will draw it using DirectX APIs. See the diagram below for more details.

Add OpenTok video to chat to Unity

Using the OpenTok Windows SDK in Unity

Using OpenTok SDK when doing a Windows application in Unity could not be easier. All you need to do is download the SDK and copy the dll files into your Assets/ folder.

Once the files are there, Unity will automatically recognize that OpenTokNET.dll is a .net assembly. It will allow you to import the OpenTok package and use the whole API.

Since OpenTok for Windows offers a .NET interface, you will be able to create a session, create publisher and all the rest of the OpenTok capabilities from you c# script. In our sample, most of the OpenTok session related code is under OpenTokSession.cs file.

The other important part of the Unity integration is the creation of the custom video render that will receive the video frames from the session, that lives in the VideoRender.cs class. The VideoRender class is an implementation of the Opentok IVideoRender interface.

Using a native plugin in Unity

C/C++ development is usually the best way to tackle some problems. Whether for performance or because you need to have some component running in different platforms, you’ll need to do some native development.

Unity supports native plugins that your project can call by using a managed bridge. For this sample, we used a native plugin in order to get the best performance when drawing the video frames onto the Unity texture.

Our native plugin has the responsibility of receiving the texture references from the Unity side. It also receives the frames to draw, and draws them. Let’s take a look at the main functions of the plugin.

The plugin is a separate project inside the whole solution. It will be built to a dll file that we’ll use later in the Unity project. In order to get more details about building the plugin, please refer to the README file of the sample repo. (Add link to sample)

CreateRenderer

Unity will call this function whenever it needs a new renderer. Internally the plugin can support several renderers. Each renderer is represented by the following structure:

typedef struct UPRenderer {
    HANDLE render_mutex;    
    void *last_frame;
    void *texture;
    int frame_height;
    int frame_width;
    int texture_width, texture_height;
} UPRenderer;

In summary, we need to save a reference to the Unity texture, a reference to the last frame, and the size of both the frames and the texture in order to keep track of changes in any of their sizes. (Live video resolution can change according to things like network conditions or CPU usage).

This function will return an unique ID that the rest of the plugin’s functions will use.

SetRendererTexture

Textures are created and assigned to the target element from Unity side. We will use this function to send the reference of the Texture2D created in the Unity side to the plugin. It will just save its reference inside the Renderer struct

ShouldCreateNewTexture

Since all the video frames could not be the same size, and a Texture2D is created with a fixed resolution, we need to know when the Texture does not fit the received video frame. The Unity project will call this function to know if it needs to create a new one because the size of the frame and the size of the texture is not the same.

SetRenderFrame

As you can tell by its name, this function saves a reference to a video frame. The plugin will use that reference later to draw it on the texture.

OnRenderEvent

Once we have all the elements in place, we need to use our saved references to the texture and the frame to draw them. This function will be called by the render cycle, and will use the Render function to draw the saved frame by using DirectX api. Take a look at the linked sample for more details.

Calling the plugin from the Unity Project

All these functions will be accessed by C# side using a simple bridge taking advantage of Platform invoke methods

public class RenderPlugin
 {
    [DllImport("RenderPlugin", EntryPoint = "CreateRenderer")]
    public static extern int CreateRenderer();

   [DllImport("RenderPlugin", EntryPoint = "SetRendererFrame")]
    public static extern int SetRendererFrame(int rendererId, IntPtr frame,
            int width, int height);
    // …
 }

The Render Loop

Until here, we have described just one side of the story. We have all the OpenTok-related code that will connect to a session, publish your video and subscribe to the session participants. That code will also receive the video frames of each video stream and will send the frames to the native plugin that will, at some point, draw them onto a Unity Texture. But we’re missing a key component: we need to integrate all these in the Unity lifecycle of an application.

You will be familiar with the Unity way of work. Imagine that you have a 3D cube: you usually attach a component which is a subclass of MonoBehaviour to your cube when you want to do something with it. That MonoBehaviour subclass has a Start and an Update method that the Unity engine will call every time it is going to render a new frame. We need to let our native plugin know that it needs to render its saved frame in its saved texture at this point.

Our sample has a class, OpenTokRenderer that will play this role. In the sample, we add this component to the object where we want to render the video stream. In its Update() method, we will tell the plugin that it needs to draw the frame. Unity uses a built in function in order to do so. You can see this in the last line of the Update method shown below:


void Update()
{
  int newWidth = 0, newHeight = 0;
  if (RenderPlugin.ShouldCreateNewTexture(rendererId,
    ref newWidth, ref newHeight) != 0)
  {                     
    texture = new Texture2D(newWidth, newHeight, TextureFormat.BGRA32, false);
    RenderPlugin.SetRendererTexture(rendererId, texture.GetNativeTexturePtr(),
         newWidth, newHeight);
    GetComponent<MeshRenderer>().material.mainTexture = texture;
  }
  GL.IssuePluginEvent(RenderPlugin.GetRenderEventFunc(), rendererId);
}

add Opentok webrtc video to a Unity game

We will add an instance of OpenTokRenderer to each GameObject that will render the video stream of an OpenTok session participant.

Get building with Unity and OpenTok

As we have seen, adding OpenTok live video chat to Unity is perfectly doable. Given the possibility of writing native code, you’ll get the same performance as you get from other usages of OpenTok. For a complete working solution, please take a look at the sample code that we have written to illustrate this blog post. The sample has been written with reusability in mind, so you can use OpenTokRenderer, OpenTokSession classes and the native plugin in your current game or Unity project.

The post Game on! Learn how to add OpenTok live video chat to Unity appeared first on TokBox Blog.

↧

Accelerate App Development with Fast Tracks

December 18, 2017, 4:48 pm

≫ Next: Connecting WebRTC endpoints and Telephony Systems with OpenTok SIP Interconnect

≪ Previous: Game on! Learn how to add OpenTok live video chat to Unity

When you sign up for a new platform, we know it can sometimes be overwhelming. It can be a challenge to know what information to seek out and simply how to quickly get started building your first project to become successful with the platform. At TokBox, we’ve received positive feedback on our developer documentation and resources, and we wanted to further streamline the development process with our platform to accelerate the path to production.

Made for developers, by developers

That’s why we’ve been working hard on our developer onboarding process to make it more customized and efficient. Today, we’re excited to introduce our new custom onboarding flow called Fast Tracks. With Fast Tracks, developers can simply answer a few questions about the app they’re planning to build, including what device they’re building for, how they want users to interact, and any advanced OpenTok features they want to add. We’ll then use this information to automatically generate a custom “track” – a personalized development plan that breaks down the build process step-by-step, complete with the appropriate documentation for each step, so you can take your app from zero to production faster.

Create Custom Track

Advance your progress in OpenTok

Each Fast Track is custom generated based on the information developers give us. In launching Fast Tracks, we’re aiming to see greater adoption with the OpenTok platform by surfacing information that developers need and making it easier for developers to discover the resources they need to be successful. Think of Fast Tracks as an ongoing course you can follow through your OpenTok journey, and with each completion of a step, you’re completion percentage will increase and we’ll offer you more tutorials as you advance.

Initial results highlight immediate impact

We’ve been working with several developers on putting together Fast Tracks and measuring their success. In looking at the early results, we saw a significant decrease in the amount of time it took for developers to find the documentation they were looking for, leading to faster time to market. This has all been part of a larger effort to make it easier for developers to find what they need in our Developer Center and get to production much more quickly.

In looking at an early test of Fast Tracks, we found that users who stream 1000 minutes in their first week increased by 23%. This helps to identify that with the addition of Fast Tracks, developers are finding the resources they need to build with OpenTok much more efficiently, getting apps in production faster.

We’re looking forward to continuing to track the results of Fast Tracks and work with the developer community to enhance their path to production with OpenTok resources.

Ready to start building? Sign up for a free trial today and get $10 of credits towards any OpenTok feature.

The post Accelerate App Development with Fast Tracks appeared first on TokBox Blog.

↧

Connecting WebRTC endpoints and Telephony Systems with OpenTok SIP Interconnect

December 19, 2017, 11:08 am

≫ Next: Chrome 63 kills the macOS WebRTC audio bug

≪ Previous: Accelerate App Development with Fast Tracks

As we continue to work towards enabling developers to reap the full potential of WebRTC, we wanted to demonstrate connecting a WebRTC audio stream with a PSTN user, using OpenTok SIP Interconnect and a third party SIP-PSTN Gateway.

OpenTok SIP Interconnect enables interoperability between WebRTC endpoints and existing telephony systems. This allows users to make SIP-based audio calls, while simultaneously browsing the website or mobile application. The OpenTok SIP Interconnect Gateway is composed of two parts: the SIP Interface and the OpenTok Interface. Below you’ll find a graphic detailing how communication works between a WebRTC app and SIP endpoints:

OpenTok SIP interconnect WebRTC to SIP

There are many benefits of using OpenTok SIP Interconnect, such as integrating with voice powered by WebRTC to offer real-time customer service or using it as a PSTN fallback in case there are connectivity issues or incompatibility between browsers. Additionally, end-users who are offsite can easily communicate by dialing into OpenTok sessions through a SIP-PSTN Gateway. To find out about more use cases, read our Introducing SIP Interconnect blog.

For supporting material, we’ve created an OpenTok-SIP-Samples repo that showcases the following samples:

Please keep in mind that the SIP Interconnect API does not support direct incoming SIP calls. However, you can implement a conferencing solution where a user dialing in from a regular phone (PSTN) and a user dialing-out using SIP Interconnect from an OpenTok Session is bridged by a SIP-PSTN gateway.

If you’re interested in learning more about how OpenTok SIP Interconnect works, please reach out to me. I’d love to know what you are working on!

The post Connecting WebRTC endpoints and Telephony Systems with OpenTok SIP Interconnect appeared first on TokBox Blog.

↧

Chrome 63 kills the macOS WebRTC audio bug

January 19, 2018, 3:25 am

≫ Next: A flexible layout container for multiparty video chat

≪ Previous: Connecting WebRTC endpoints and Telephony Systems with OpenTok SIP Interconnect

Our good friend Philipp Hancke wrote a great post recently on a WebRTC audio bug that has been plaguing Chrome on MacOS for the last few years. The issue presented itself as the microphone not working sometimes in Chrome on MacOS until you completely restart the machine. This seemed to happen after a Mac went to sleep and then woke up again.

The good news is that this is fixed with Chrome 63! Philipp put together a great chart showing the error rates in different versions of Chrome which clearly shows the drop off with Chrome 63.

Daily occurence of macOS Chrome audio bug in Chrome 62 and 63

Issue tracking at TokBox

At TokBox, we have been keeping track of this issue as well. We reported it to our customers last year when it became a big problem. We provided a way to detect when this happens in our JS SDK. We also worked with Google to help to resolve the issue by providing crash dumps and data about our customers experiencing the issue.

We have been tracking instances of this error experienced by our customers and can confirm that Chrome 63 seems to have fixed the issue.

macOS WebRTC audio bug

As you can see from this chart, Chrome#63 failure rate has basically dropped to 0 when compared to other versions. It’s that flat black line at the bottom.

It’s great to see the WebRTC community coming together to report and diagnose these types of issues. Continuing to improve the quality of WebRTC for everyone!

You can find information about bugs currently affecting WebRTC and how we are working to solve them for our customers and the wider community in our Knowledge Base.

The post Chrome 63 kills the macOS WebRTC audio bug appeared first on TokBox Blog.

↧

A flexible layout container for multiparty video chat

January 24, 2018, 3:24 am

≫ Next: Zenbag cruises to victory with OpenTok at CruzHacks

≪ Previous: Chrome 63 kills the macOS WebRTC audio bug

When anyone builds a multiparty video chat application they pretty quickly run into the issue of how to lay out the many different participants. You want everyone to be visible to everyone else and you also want video to take up as much space as possible, without wasted white-space. In the web you have the additional complexity of lots of different display sizes. Participants can be on a mobile device or tablet, or even just resizing their browser window so that they can see something else beside the video chat. For this reason you want a layout algorithm that is responsive.

We have been using a layout container we wrote internally for our applications (eg. meet.tokbox.com). We’ve been gradually iterating on it, adding more and more features and different ways to lay out participants. You can find the layout container at https://github.com/aullman/opentok-layout-js along with instructions for how to easily drop it into any OpenTok application.

You can also see this layout algorithm in action on meet.tokbox.com.

Fixed Aspect Ratio

Our first attempt at solving this problem was to build a layout container that makes all participants the same size. It just assumes that every participant is the same aspect ratio and then it calculates what the optimum layout is.

This algorithm looks at the aspect ratio of the first participant, in this case 16:9 and then makes every other participant that dimension. It then uses a brute force method to determine the best layout. It does this by trying different combinations of rows and columns and finds which one takes up the most space. So for example if there are 4 participants it will try 1 column and 4 rows, 2 columns and 2 rows, 3 columns and 2 rows or 4 columns and 1 row. It will then figure out which of these arrangements takes up the most space inside its container. This way it will actually work no matter what the dimensions of the container is.

A tall container results in 1 column and 4 rows.

Vertical layout container for multiparty video chat

A wide container results in 4 columns and 1 row.

Horizontal layout container for multiparty video chat

Most other sizes result in 2 columns and 2 rows.

2 x 2 layout container for multiparty video chat

You can get this layout by using the layout container and passing in fixedRatio: true like so:

var initLayoutContainer = require('opentok-layout-js');
 var layout = initLayoutContainer(document.getElementById("layout"), {

fixedRatio: true

});
 layout.layout();

Cropping

You probably noticed that you are still often left with some white-space around the videos. We can eliminate some of that white-space by cropping the videos. In a normal face-to-face conversation you don’t need the sides of the video. So if we crop those off we can have more of our layout be the video of our participants faces.

Respecting lots of aspect ratios

Sometimes not all participants have the same aspect ratio. With OpenTok now supporting Safari on iOS there are more and more participants joining from mobile devices. Also with screensharing there are any number of different aspect ratios possible.

There are also times when you don’t want to crop any of the video. For example, when sharing a screen or when there are multiple participants all in front of one camera and you want to be able to see all of them. In these situations you don’t want to crop any of the video because you could lose some important information.

To solve this let’s modify our layout algorithm to respect multiple different aspect ratios.

multiparty video chat flexible layout example

In this example you can see that there are several different aspect ratios being displayed in the container and all of them are being respected. In order to make this algorithm work we looked for some inspiration from photo layout algorithms like the Flickr justified layout. We couldn’t use the Flickr layout directly because it does not work quite the same way. It assumes you are going to be scrolling through a list of photos and so it doesn’t constrain the height. But we can borrow some of the concepts.

Again this is just a variation on the existing layout algorithms. It uses the same algorithm to determine how many rows and columns to use. Then when placing the elements, it just makes sure to respect their native aspect ratios. Then (borrowing from Flickr) if the whole height of the container isn’t taken up it does another pass through of each row to see whether they could be made taller to take up more of the width. This means that some rows can now be taller than others but it also means that we either take up the whole height or the whole width of the allocated space.

Putting it all together

We have put all of these layout algorithms together in meet.tokbox.com with different buttons allowing you to switch between different layouts. In the future it would be interesting to automatically detect when it makes sense to respect aspect ratios and when not to. We could detect faces in the video and if there were more than one make sure that you respect the aspect ratio so you don’t crop participants out.

If you want to get started building a multiparty video chat app with this layout algorithm, sign up here for a trial account with $10 of free credit.

The post A flexible layout container for multiparty video chat appeared first on TokBox Blog.

↧