
HC 2006

Articulated Narrowcasting for Privacy and Awareness in Multimedia Conferencing Systems and Design for Implementation Within a SIP Framework

  1. Mohammad Sabbir Alam University of Aizu
  2. Michael Cohen University of Aizu
  3. Ashir Ahmed Kyushu University

Abstract

This article proposes a new focus of research for multimedia conferencing systems which allows a participant to flexibly select another participant or a group for media transmission. For example, in a traditional conference system, participants' voices might by default be shared with all others, but one might want to select a subset of the conference members to send media to or receive media from. We review the concept of narrowcasting, a model for limiting such information streams in a multimedia conference, and describe a design to use existing standard protocols (SIP and SDP) for controlling fine-grained narrowcasting sessions.

  1. published: 2009-01-08


1.  Introduction

Multimedia conferencing has been on the research agenda for many years. A traditional conferencing system over the pstn (public switched telephone network) has many features implemented in a centrally controlled conference server. The development of ip technology has brought new media (e.g., video) into conferencing systems. H.323 [ IT03 ] and sip (Session Initiation Protocol) [ RSC02, Joh04 ] are popular protocols for ip-based conferencing systems. Sip, a simpler text-based protocol developed by the ietf (Internet Engineering Task Force), added presence features that allow users to discover the availability of participants and also, with a large extension, to control media transmission and its direction from the endpoints. Although sip was designed for multimedia conferencing systems, so far only VoIP applications have gained popularity in the industry and received priority in the sip design community (working groups). The sipping [ CBP08 ] and xcon [ JRPJ08 ] working groups (wgs) inside the ietf are considering conferencing frameworks. While sipping is designing a conferencing framework using sip, the xcon system is independent of any signaling protocol. Both conferencing models focus only on centralized conferencing systems, where the signaling and media mixing are handled by a central conference server and centralized media mixer.

However, one may want to control the media of a particular participant: for example, participant P1 may want to block media from participant P2, or to receive media streams only from participant P3. Controlling such media vectors from an endpoint has been a challenging issue. As a simple example, a user's voice might by default be shared with all others in a conference, but a versatile interface would allow a secret to be shared only with some selected subset of the members. Current commercially available conference systems do not generally support such features.

Our research introduces a flexible multiparty multimedia user-adjustable conference system, including “narrowcasting” functionality, as an application within the sip framework. A human user wants to distribute attention and availability, and narrowcasting provides a formalization of such presence filters. Narrowcasting systems extend broad- and multicasting systems by allowing media streams to be filtered for relevancy control, privacy, security, and user interface optimization. As sip was designed for multimedia session control, narrowcasting attributes can be implemented within the existing sip framework. In this article, we propose a design for narrowcasting attributes and consider the feasibility of implementing it in a sip framework.

The rest of this article is structured as follows. Section 2 reviews some background information regarding conferencing. Section 3 explains our proposal for a sip-based implementation. Section 4 details the call flow of a narrowcasting implementation in sip. Finally, the conclusion and ideas for future research are presented in Section 5.

2.  Conventional Conference Architecture and Call Control Limitations

This section discusses a common conference architecture, requirements of a typical conference system, and limitations of existing systems.

2.1.  Architecture

Conventional conferencing systems can be categorized into three different types, depending upon where media streams from participants are mixed.

Centralized Conferencing
In a centralized model, media and signaling pass through a central conference bridge [ SKBR03 ]. The conference bridge is a conceptually simple device, consisting of a sip user agent to handle signaling, an rtp mixer to handle the media streams, a conference application layer for authentication, authorization, and accounting services, and possibly conference control functions. Participants establish one-to-one media and signaling connections with the bridge. The bridge establishes voice paths between endpoints by collecting input signals and returning summed signals to conferees. Figure 1 illustrates how the media is mixed, (en/de)coded if necessary, and redistributed to participants.

Most current multimedia conferencing systems fall into this category. As permissions are controlled by an administrator (a.k.a. floor controller), end users don't have much access to configuration features.

Figure 1. Media Mixing in a Centralized Conferencing



Decentralized Conferencing
In a decentralized model, signaling control is centralized but media are exchanged between participants without going through a centralized bridge, so no central server handles the media. Decentralized conferencing can be either of two types: full mesh or multicast.

Figure 2. Full Mesh Conferencing



A. Full Mesh Conferencing: A full-duplex media link (Figure 2) can be established between every pair of participants, resulting in a fully-connected mesh. Each endpoint transmits a copy of its stream to the N - 1 other endpoints, and receives N - 1 streams in return, on separate ports. Each pair of participants can communicate through any mutually supported codec type.

B. Multicast Conferencing: In a multicast conference, participants join a session by subscribing to a conference multicast address. This address might be advertised by one of the participants or a central server, or distributed to conferees prior to a conference. Each participant transmits a single copy of his stream to the conference multicast address, receiving N - 1 streams in return. From a receiver perspective, nothing changes from the full mesh arrangement except that the streams arrive on a single port. Multicast conferences can scale up to millions of users and do not really require any sip signaling. However, native multicast is not yet widely available.
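The per-endpoint stream counts described for the three arrangements can be tabulated with a short sketch. The function name `stream_counts` and the dictionary keys are hypothetical, chosen only for illustration; the centralized case assumes one stream up to the bridge and one mixed stream back, as described for the conference bridge above.

```python
def stream_counts(n: int, model: str) -> dict:
    """Return streams sent and received per endpoint for n participants.

    Hypothetical helper for illustration only.
    """
    if n < 2:
        raise ValueError("a conference needs at least two participants")
    if model == "centralized":
        # One stream up to the bridge, one mixed stream back.
        return {"sent": 1, "received": 1}
    if model == "full_mesh":
        # A copy to each of the N - 1 peers, and N - 1 streams in return.
        return {"sent": n - 1, "received": n - 1}
    if model == "multicast":
        # A single copy to the multicast address, N - 1 streams in return.
        return {"sent": 1, "received": n - 1}
    raise ValueError(f"unknown model: {model}")


for model in ("centralized", "full_mesh", "multicast"):
    print(model, stream_counts(4, model))
```

The comparison makes the trade-off visible: full mesh multiplies upstream traffic at every endpoint, while multicast keeps the upstream cost constant but still leaves mixing of the N - 1 received streams to each endpoint.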

2.2.  Requirements of a Flexible Conferencing System

To implement a flexible end-to-end conferencing system, the following considerations apply:

General requirements: A conference control framework should be scalable, extensible, generic, reliable, and secure. The scalability requirement means that the conference control framework must support reasonably large, geographically distributed, conferences. Moreover, it should be extensibly modular so that new components can be easily added or existing components changed. The conference control framework must also be generic so that it is not tied to any particular application. While conference control protocols are likely to consume significantly less bandwidth than media streams, some care needs to be taken for large conferences. Since the conference description and policy information can be massive, incremental updates are preferred to having to resend entire descriptions after each change. Similarly, changes in participant lists should be distributed as additions and removals. Also, not all participants care about the same level of detail; for example, some may only be interested when new members join or leave, but not when a participant adds herself to a floor queue. The importance of reliability and security is obvious.

Session establishment: A mechanism is required to establish connections among multiple participants, to manipulate and describe media “mixing” or “topology” for multiple media types (audio, video, text, position data, etc.). Sip is a good candidate for this purpose. Technical challenges involve flexibly defining the media and its transmission using the sdp (Session Description Protocol) [ HJ98 ].

Network resource management: Network resources are an important factor determining the communication quality of a conference, or “QoS” (quality of service). Conferencing on a best-effort internet is an on-going challenge. Large delay or jitter irritates participants and degrades conference quality. Considering network characteristics and available bandwidth, proper encoding/decoding schemes must be deployed.

Policy: A user rights database specifies the privileges of potential participants. User rights lists might include information about who can authorize the admission or expulsion of participants and who can act on floor control requests. Such functions are often combined into the role of a moderator, but a flexible system should allow them to be distributed among a set of participants.

Security and privacy: Unwelcome participants must be excluded, so that no unauthorized party can intrude upon or eavesdrop on a conference. A mechanism for membership and authorization control is required. The policy may describe which users are pre-authorized to join (“white list”), are explicitly forbidden from joining (“black list” or “block list”), or may join but in listen-only (“lurk”) mode. Since internet-based signaling protocols offer a variety of authentication mechanisms, a policy might also define at what strength each participant must authenticate. Unauthenticated users may be rejected or relegated to audience status.
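The list-based membership policy just described can be sketched as a small decision function. Everything here is an illustrative assumption rather than part of any conferencing standard: the names `Admission` and `admit`, the order in which the lists are checked, and the choice to relegate unauthenticated users to listen-only status instead of rejecting them.

```python
from enum import Enum


class Admission(Enum):
    PARTICIPANT = "participant"  # may send and receive media
    LURKER = "lurker"            # listen-only ("lurk") mode
    REJECTED = "rejected"


def admit(user: str, authenticated: bool, white_list: set,
          block_list: set, lurk_list: set) -> Admission:
    """Decide how a joining user is admitted under a simple list-based policy."""
    if user in block_list:
        return Admission.REJECTED
    if not authenticated:
        # Assumed policy: unauthenticated users get audience status only.
        return Admission.LURKER
    if user in white_list:
        return Admission.PARTICIPANT
    if user in lurk_list:
        return Admission.LURKER
    return Admission.REJECTED


# Hypothetical users, for illustration.
print(admit("alice", True, {"alice"}, set(), set()))
print(admit("mallory", True, {"alice"}, {"mallory"}, set()))
```

A deployment that prefers to reject unauthenticated users outright would change only the second branch.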

2.3.  Related Research

Over the years, there have been many studies in the area of conference control [ KSW02, SNS01 ]. Most earlier works discuss only floor control aspects of conference control. Standardization efforts have met with limited success. H.323, developed by the itu-t, has several problems, including scalability issues due to an insufficient T.124 database replication protocol and its restriction to a binary asn.1 (not text-based) format. Sip, in contrast, is a text-based protocol which can easily interact with other internet protocols. Sip is a signaling protocol for creating, modifying, and terminating multimedia sessions between multiple participants. Conferencing is possible using standard sip methods [ RSC02 ], allowing users to join and leave conferences and allowing invitation of other participants. However, sip by itself does not offer configurable conference policies, participant access lists, floor control, or user privilege levels. The sipping (Session Initiation Proposal Investigation) [ CBP08 ] wg is chartered to develop requirements for extensions to sip needed for multi-party applications. Xcon, working closely with sipping, focuses on development of a standardized suite of protocols for tightly coupled multimedia conferences [ JRPJ08 ].

A limitation of traditional conferencing systems is that a participant (not a conference administrator) cannot control other participants' displays. Current conferencing systems generally do not let a participant select a subset of the conference participants to whom media are sent or from whom streams are received. In this article, we introduce narrowcasting attributes to implement media restriction features within a sip framework.

3.  Enhancement of Conferencing System: Narrowcasting

In this section, we describe the feature set for narrowcasting in sip-based conferences. In our group's earlier publications [ FCDK05, ACA05, FAD06 ], we introduced the concept of narrowcasting attributes, described functions to apply these features in a standard conferencing model (recapitulated in Figure 6), and proposed how features could be implemented using standard sip methods and headers defined in rfc 3261 [ RSC02 ]. Advantages of such a deployment include the convenience that no new methods or header extensions would be required to implement the features.

3.1.  Narrowcasting Concept

Figure 3 shows a famous Japanese sculpture which is a good example of narrowcasting attributes. The three monkeys, Mizaru (the monkey with eyes covered), Iwazaru (mouth covered), and Kikazaru (ears blocked), manifest the notion of limiting media vectors. Mizaru cannot see (but can hear and speak); Iwazaru cannot speak (but can see and hear); Kikazaru cannot hear (but can speak and see).

Figure 3. Media Restriction (Narrowcasting Attribute)


In analogy to broad-, multi-, and any-casting, narrowcasting is a technique for limiting and focusing information streams, either sources or sinks (receivers). We employ the paradigm of multiple simultaneous chatspaces, each with several or many conversants and across which one has “multipresence,” permitting designation of multiple instances of one's “self.” The audio windows narrowcasting predicate calculus [ Coh00 ] is a formalization for such a permission scheme. In Table 1, narrowcasting audio attributes are listed and their characteristics explained. This article proposes deployment of these attributes within a sip framework.

Figure 4. A Three-Party Conference Model



Figure 4 shows the initial state of a conference in which three participants (P1, P2, and P3) can talk to and hear each other. In other words, all the participants are in a fully connected media relationship. Our design will allow each user to send or receive data streams to/from a specific set of recipients in a session. For easier understanding, we consider only audio streams in this article. However, this design applies equally well to other media types, including video, text, and data (geographic location, for example).

Table 1.  Proposed Audio Narrowcasting Attributes

| Attribute | Description |
|---|---|
| Mute | blocks the media stream coming from a source. In Table 2(a,b), P1 mutes P2, i.e. P1 blocks the media coming from P2. As a result, P1 does not hear P2. However, P2 can still hear P1. |
| Select | limits the projected sound to particular sources. In Table 2(c,d), P1 selects P2, i.e. P1 focuses on media coming from P2. As a result, P1 can listen only to P2's voice; P1 cannot hear other participants. |
| Deafen | blocks media streams going to a sink. In Table 2(e,f), P1 deafens P2, i.e. P1 blocks media going towards P2. As a result, P2 cannot hear P1. The relationship between P1 and other participants remains the same. |
| Attend | limits received sound to particular sinks. In Table 2(g,h), P1 attends P2, i.e. media from P1 can go only to P2. As a result, only P2 can hear P1 but others can't. |


3.1.1.  Source Functions: Mute and Select

A “mute” function is available in present-day conference systems. However, in most cases, a participant mutes herself by connecting the other conversant to “music on hold.” On-hold parties hear the music, but no voice media is transmitted. In our definition, a user can explicitly mute another party.

In Table 2(a), three participants participate in a conference in which P2 has been muted by P1. This means P1 doesn't want to hear P2, but only P3. Specifically,

  • P1 has a simplex (one-way) relationship with P2, P1 → P2.

  • P1 has a duplex (two-way) media relationship with P3, P1 ↔ P3.

  • P2 has a duplex media relationship with P3, P2 ↔ P3.

As a result,

  • When P1 speaks, both P2 and P3 will hear.

  • When P2 speaks, only P3 will hear (and NOT P1).

  • When P3 speaks, both P1 and P2 will hear.

Equivalently for this simple example, P3 might be selected by P1. The connectivity of the situation shown in Table 2(a) is representable in matrix form as

\[
C = \begin{pmatrix} - & 1 & 1 \\ 0 & - & 1 \\ 1 & 1 & - \end{pmatrix}
\]

where entry \(c_{ij}\) of the matrix represents connectivity of source i to sink j (1 if sink j hears source i, 0 if not), and the main diagonal is populated by “don't care”s (written here as -).

A scenario with four participants in a session is shown in Table 2(d). Here P2 is selected by P1, so P1 can hear only P2 but not others. Other participants can hear as usual. The connectivity of Table 2(d) is represented as

\[
C = \begin{pmatrix} - & 1 & 1 & 1 \\ 1 & - & 1 & 1 \\ 0 & 1 & - & 1 \\ 0 & 1 & 1 & - \end{pmatrix}.
\]
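The connectivity matrix defined above, with entry c_ij for source i and sink j, can be built programmatically. The following is a minimal sketch under stated assumptions: entries are 1 (audible) or 0 (blocked), `None` stands for the “don't care” diagonal, and the function names `full_connectivity`, `mute`, and `select` are hypothetical helpers rather than part of any sip library.

```python
def full_connectivity(n: int) -> list:
    """n x n matrix c[i][j]: 1 if sink j hears source i; None on the diagonal."""
    return [[None if i == j else 1 for j in range(n)] for i in range(n)]


def mute(c: list, user: int, source: int) -> None:
    """`user` mutes `source`: block the stream from source to user."""
    c[source][user] = 0


def select(c: list, user: int, source: int) -> None:
    """`user` selects `source`: block every other source's stream to user."""
    for i in range(len(c)):
        if i not in (user, source):
            c[i][user] = 0


# Indices are zero-based: P1 is 0, P2 is 1, and so on.
three_party = full_connectivity(3)
mute(three_party, user=0, source=1)      # P1 mutes P2, as in Table 2(a)

four_party = full_connectivity(4)
select(four_party, user=0, source=1)     # P1 selects P2, as in Table 2(d)

for row in three_party:
    print(row)
```

Both source functions touch only the column of the user who issues the command, which reflects that mute and select restrict what that user receives and leave what everyone else hears unchanged.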

3.1.2.  Sink Functions: Deafen and Attend

Remote deafen is also available in full-functioned conferencing systems as “Listen-only mode.” In most cases, only an end-user or administrator may invoke this feature. In our definition, any user can control the media sent to or received from another.

Table 2. Narrowcasting Models

| Control | Mute | Select | Deafen | Attend |
|---|---|---|---|---|
| P1P2 | (a), (b) | (c), (d) | (e), (f) | (g), (h) |
| Situation | A participant wants to block media from a specific participant. In this scenario, P1 mutes P2. | A participant wants to receive media only from a particular participant. In this scenario, P1 selects P2. | A participant wants to block media to specific participant(s). In this scenario, P1 deafens P2. | A participant wants to send media to a specific participant. In this scenario, P1 attends P2. |
| Result | P1 has a send-only relationship with P2. Other media vectors remain the same. | Only P1 ↔ P2 remains the same. Other participants have a receive-only media relationship with P1. | P1 has a receive-only media relationship with P2. Others remain the same. | Media from P1 only goes to P2. Others only send to P1 but cannot receive media from P1. |


In Table 2(e), P2 is deafened by P1. This means P1 doesn't want P2 to hear his voice. Specifically,

  • P1 has a simplex media relationship with P2, P1 ← P2.

  • P1 has a duplex media relationship with P3.

  • P2 has a duplex media relationship with P3.

In this case:

  • When P1 speaks, P3 will hear, but P2 won't.

  • When P2 speaks, both P1 and P3 will hear.

  • When P3 speaks, both P1 and P2 will hear.

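Equivalently, P3 might be attended by P1, so that only P3 can hear P1. P1 could still hear all other streams. The connectivity matrix for Table 2(e) is

\[
C = \begin{pmatrix} - & 0 & 1 \\ 1 & - & 1 \\ 1 & 1 & - \end{pmatrix}.
\]

In Table 2(h), P2 is attended by P1. As a result only P2 can hear from P1. For four participants, as in Table 2(d), the connectivity matrix of this situation is

\[
C = \begin{pmatrix} - & 1 & 0 & 0 \\ 1 & - & 1 & 1 \\ 1 & 1 & - & 1 \\ 1 & 1 & 1 & - \end{pmatrix}.
\]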

For egalitarian models with flat hierarchies, there is an asymmetry regarding both mute/select and deafen/attend: audibility of a source with respect to a sink is treated as a revocable privilege and a forsakable right. A sink can by default hear collocated sources, adjustable by narrowcasting commands. For example, if P2 attends P1 but P1 has muted P2, P1 won't hear P2. Further policy extensions will extend the permissions of such a protocol, including the ability to force audibility by overriding a source's mute or sink's deafen (which a parent might invoke when telechiding a distracted child: “How dare you mute me?!”). Consideration of such role-based issues will be the focus of future research.
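How source and sink filters combine can be illustrated with a simplified audibility predicate. This is a sketch under stated assumptions, not the full predicate calculus of [ Coh00 ]: it treats an empty select or attend set as “no restriction,” ignores self-applied attributes and role-based overrides, and uses the hypothetical names `Filters` and `audible`.

```python
from dataclasses import dataclass, field


@dataclass
class Filters:
    """Narrowcasting commands issued by one participant (illustrative)."""
    mute: set = field(default_factory=set)    # sources this user blocks
    select: set = field(default_factory=set)  # if non-empty, only these sources are heard
    deafen: set = field(default_factory=set)  # sinks this user blocks
    attend: set = field(default_factory=set)  # if non-empty, only these sinks are addressed


def audible(source: str, sink: str, filters: dict) -> bool:
    """True if `sink` hears `source` once both sides' filters are applied."""
    src = filters.get(source, Filters())
    snk = filters.get(sink, Filters())
    sink_accepts = source not in snk.mute and (not snk.select or source in snk.select)
    source_sends = sink not in src.deafen and (not src.attend or sink in src.attend)
    return sink_accepts and source_sends


# P2 attends P1, but P1 has muted P2: P1 still does not hear P2.
filters = {"P1": Filters(mute={"P2"}), "P2": Filters(attend={"P1"})}
print(audible("P2", "P1", filters))  # False
print(audible("P2", "P3", filters))  # False: P2 attends only P1
print(audible("P1", "P2", filters))  # True
```

Because the result is a conjunction of the sink's and the source's conditions, either side can withdraw audibility but neither can force it, which matches the flat, egalitarian model described above.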

3.2.  Sip for Multimedia Conferencing

Peers in a sip session are called user agents, and can function in the following roles:

User-Agent Client (uac): A client application that initiates a sip request.

User-Agent Server (uas): A server application that contacts the user when a sip request is received and returns a response on behalf of the user.

A sip end-point is capable of functioning as both a uac and a uas, but typically functions as only one or the other per session, depending upon the user agent that initiated the request.

Sip makes use of elements called proxy servers to help route requests to users' current locations, authenticate and authorize users for services, implement provider call-routing policies, and provide features to users. Sip also provides a registration function that allows users to upload their current locations (ip addresses) for use by proxy servers.

3.3.  Session Establishment in sip

A typical handshaking exchange is shown in Figure 5, in which P1 sends an invite request with media capabilities to P2. A 100/trying message indicates that the request is being processed, and a 180/ringing message confirms that P2 is being alerted. A 200/ok message (which might also contain the final session description message body, whose significance will be explained later) is sent once P2 accepts the invite, usually triggered by a human user, notifying P1 that a connection has been made. Upon receiving the 200/ok from P2, P1 sends an ack. A two-party duplex session is established at this point. The delay between the 180/ringing and 200/ok messages depends upon how many rings pass before the user accepts the call. Participants wishing to leave a session send a bye request within the session dialog [ ACA04 ].

Figure 5. Call Flow of a Typical, Simple sip Session


Sip signaling can be transported over either tcp or udp; a standard sip entity must support both [ RSC02 ]. For realizing narrowcasting attributes over sip, a client will follow the guideline of rfc 3261 Section 18: if a request is within 200 bytes of the path mtu (maximum transmission unit), or if it is larger than 1300 bytes and the path mtu is unknown, the request must be sent using an rfc 2914 congestion-controlled transport protocol, such as tcp.
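The transport guideline can be expressed as a small check. This sketch follows the size rule of rfc 3261 Section 18.1.1 as we read it: a congestion-controlled transport is required when the request is within 200 bytes of a known path mtu, or larger than 1300 bytes when the path mtu is unknown. The function name and the use of `None` for an unknown mtu are assumptions made for illustration.

```python
from typing import Optional


def needs_congestion_controlled_transport(request_size: int,
                                          path_mtu: Optional[int]) -> bool:
    """Return True if a request of `request_size` bytes should go over tcp."""
    if path_mtu is None:
        # Path mtu unknown: fall back to the 1300-byte threshold.
        return request_size > 1300
    # Path mtu known: the request must stay more than 200 bytes below it.
    return request_size >= path_mtu - 200


print(needs_congestion_controlled_transport(1400, None))   # True
print(needs_congestion_controlled_transport(1000, 1500))   # False
```

A re-invite that carries only a short sdp body, as in the narrowcasting design, will normally fall well under these thresholds, so udp usually remains usable.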

Figure 6. Conferencing Model



3.4.  The Conferencing Model

Narrowcasting attributes can be implemented in both centralized and decentralized conferences. This article focuses on a decentralized conference architecture, for which the media is mixed at each end-point. Figure 6 illustrates components of the conferencing system and their roles. We have extended the model being proposed by the ietf with narrowcasting attributes.

Focus: The focus is a sip user agent addressed by a conference uri (uniform resource identifier). It handles sip signaling between participants in a conference. The focus establishes media exchange among participants in a conference, and also implements conference policies. Its logical role is in analogy to that of a controller in a centrally signaling, distributed media architecture.

Participants: User agents are identified by a uri, communicating with each other after having been connected through the focus.

Conference notification service: The focus can act logically as a notifier [ Roa02 ], accepting subscriptions to the conference and notifying subscribers about changes to that state. The state includes the state maintained by the focus itself, the conference policy, and the media policy.

Conference policy server: A conference policy server stores and manipulates rules using an xcap (Extensible Markup Language Configuration Access Protocol) [ Ros07 ] database associated with participation in a conference. These rules include directives on the lifespan of the conference, who can and cannot join it, who can override the media policy, definitions of roles available in the conference, and the responsibilities associated with those roles.

Conference policy: The complete set of rules governing a particular conference is interpreted and enforced by the conference policy server.

4.  Design for Implementation of Narrowcasting Attributes in sip

Narrowcasting attributes can be implemented inside sip by modifying only the generator of the sdp message body. Section 3.3 described session establishment in sip, where sdp is used to indicate media capabilities and destination addresses.

Media negotiation is part of the invite/200/ack sequence to establish a sip session between two endpoints. Sip itself doesn't provide media negotiation, but it enables media negotiation between user agents using sdp. Each participant sends information via sdp, in either an invite or an ack, about her terminal's media capabilities and the transport address at which she wishes to receive rtp packets. In the sdp body attached to the sip header, the user agents specify the media type, codec, ip address, and port number for each media stream. In the message body of the 200/ok response to the invite, the server sends its accepted media capabilities and the transport address to which the participant should send rtp packets. Our implementation in sip [ ACA07 ] will use the narrowcasting attributes mute, select, deafen, and attend, along with the media capabilities in the invite/200/ack sequence in the sdp bodies.
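To make the sdp fields concrete, the following sketch extracts the connection address, media description, and direction attribute from a body like the one used in Section 4.1. The parser is deliberately minimal and is an assumption-laden illustration, not a complete sdp implementation: the name `parse_sdp` is hypothetical, it does not distinguish session-level from media-level attributes, it expects a plain port number, and it defaults the direction to sendrecv when no direction attribute is present.

```python
def parse_sdp(body: str) -> dict:
    """Extract connection address, media lines, and direction from an sdp body."""
    info = {"address": None, "media": [], "direction": "sendrecv"}
    for line in body.splitlines():
        line = line.strip()
        if len(line) < 2 or line[1] != "=":
            continue  # skip blank or malformed lines
        key, value = line[0], line[2:]
        if key == "c":
            # c=<nettype> <addrtype> <connection-address>
            info["address"] = value.split()[2]
        elif key == "m":
            # m=<media> <port> <proto> <fmt> ...
            media, port, proto, *formats = value.split()
            info["media"].append({"type": media, "port": int(port),
                                  "proto": proto, "formats": formats})
        elif key == "a" and value in ("sendonly", "recvonly", "sendrecv", "inactive"):
            info["direction"] = value
    return info


example = """v=0
o=sabbir 2345 3345 IN IP4 judo.u-aizu.ac.jp
c=IN IP4 123.456.789.101
m=audio 2410 RTP/AVP 0
a=sendonly"""
print(parse_sdp(example))
```

The direction attribute is the only field that the narrowcasting design changes between re-invites; the address, port, and codec lines stay as negotiated.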

Figure 4 showed multiparty voice communication between P1, P2, and P3. Considering the participants' media flow, we propose the protocol elaborated below. In our design we consider the existing standard media session and send a re-invite by modifying the sdp body.

4.1.  Mute

Figure 7 illustrates a scenario in which P1, P2, and P3 are in an rtp media session. If P1 wants to mute P2, P1 sends a re-invite to P2 with a modified sdp attribute, a=sendonly. P2 then responds with 200/ok including a=recvonly along with other sdp attributes. As the negotiation determines that media is sent only from P1 to P2, a one-way rtp connection is established (P1 → P2). Thus P2 is muted by P1. The status of other participants (i.e., P3 in this example) remains unchanged. An example of the re-invite/ok handshake in Figure 7 is shown below, where the first block of each log is the sip header and the second block is the sdp body.

Figure 7. Mute Call Flow




  INVITE sip:cohen@voice.u-aizu.ac.jp SIP/2.0
  Via: SIP/2.0/UDP 123.456.789.101
  From: sabbir <sip:sabbir@judo.u-aizu.ac.jp>
  To: cohen <sip:cohen@voice.u-aizu.ac.jp>
  Call-ID: 627802096@judo.u-aizu.ac.jp
  CSeq: 1 INVITE
  Contact: <sip:sabbir@123.456.789.101>
  Content-Type: application/sdp
  Content-Length: 110

  v=0
  o=sabbir 2345 3345 IN IP4 judo.u-aizu.ac.jp
  c=IN IP4 123.456.789.101
  m=audio 2410 RTP/AVP 0
  a=sendonly

The 200/ok sequence looks like


  SIP/2.0 200 OK
  Via: SIP/2.0/UDP 123.456.789.101
  From: sabbir <sip:sabbir@judo.u-aizu.ac.jp>
  To: cohen <sip:cohen@voice.u-aizu.ac.jp>;tag=659882290
  Call-ID: 627802096@judo.u-aizu.ac.jp
  CSeq: 1 INVITE
  Contact: <sip:cohen@123.456.789.102>
  Content-Type: application/sdp
  Content-Length: 110

  v=0
  o=sabbir 2345 3345 IN IP4 voice.u-aizu.ac.jp
  c=IN IP4 123.456.789.102
  m=audio 2410 RTP/AVP 0
  a=recvonly
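Since the design changes only the sdp generator, the body shown in the listings above can be produced by a short function. This is a sketch that mirrors the abbreviated bodies in this section (a complete session description would carry further lines, such as s= and t=); the function name `build_sdp` and its parameters are hypothetical, and the session identifiers 2345 and 3345 are simply copied from the listings.

```python
def build_sdp(user: str, host: str, ip: str, port: int, direction: str) -> str:
    """Build a minimal sdp body ending with the requested direction attribute."""
    if direction not in ("sendonly", "recvonly", "sendrecv", "inactive"):
        raise ValueError(f"unsupported direction: {direction}")
    lines = [
        "v=0",
        f"o={user} 2345 3345 IN IP4 {host}",
        f"c=IN IP4 {ip}",
        f"m=audio {port} RTP/AVP 0",
        f"a={direction}",
    ]
    # sdp lines are terminated by CRLF.
    return "\r\n".join(lines) + "\r\n"


# Offer sent by P1 when muting P2 (values taken from the listing above).
offer = build_sdp("sabbir", "judo.u-aizu.ac.jp", "123.456.789.101", 2410, "sendonly")
content_length = len(offer.encode("utf-8"))  # value for the Content-Length header
print(offer)
```

Computing the Content-Length from the encoded body, rather than hard-coding it, keeps the header consistent whenever the direction attribute changes.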

4.2.  Deafen

In order to deafen P2, P1 sends a re-invite to P2 with a modified sdp attribute, a=recvonly. P2 then responds with 200/ok including a=sendonly along with other sdp attributes. As the negotiation determines that media is transmitted only from P2 to P1, a simplex media connection is established (P2 → P1), thereby deafening P2 by P1.

Figure 8. Deafen Call Flow




  INVITE sip:cohen@voice.u-aizu.ac.jp SIP/2.0
  Via: SIP/2.0/UDP 123.456.789.101
  From: sabbir <sip:sabbir@judo.u-aizu.ac.jp>
  To: cohen <sip:cohen@voice.u-aizu.ac.jp>
  Call-ID: 627802097@judo.u-aizu.ac.jp
  CSeq: 2 INVITE
  Contact: <sip:sabbir@123.456.789.101>
  Content-Type: application/sdp
  Content-Length: 110

  v=0
  o=sabbir 2345 3345 IN IP4 judo.u-aizu.ac.jp
  c=IN IP4 123.456.789.101
  m=audio 2410 RTP/AVP 0
  a=recvonly

The 200/ok sequence looks like


  SIP/2.0 200 OK
  Via: SIP/2.0/UDP 123.456.789.101
  From: sabbir <sip:sabbir@judo.u-aizu.ac.jp>
  To: cohen <sip:cohen@voice.u-aizu.ac.jp>;tag=659882291
  Call-ID: 627802097@judo.u-aizu.ac.jp
  CSeq: 2 INVITE
  Contact: <sip:cohen@123.456.789.102>
  Content-Type: application/sdp
  Content-Length: 110

  v=0
  o=sabbir 2345 3345 IN IP4 voice.u-aizu.ac.jp
  c=IN IP4 123.456.789.102
  m=audio 2410 RTP/AVP 0
  a=sendonly

4.3.  Select

In order for P1 to select P2, P1 sends a re-invite to all other participants except for P2 with a modified sdp, a=sendonly, and other participants in the conference respond with 200/ok with a=recvonly along with other sdp attributes. A one-way media connection is established between P1 and other participants excepting P2, so P2 is selected by P1.

Figure 9. Select Call Flow




  INVITE sip:ashir@gifu.u-aizu.ac.jp SIP/2.0
  Via: SIP/2.0/UDP 123.456.789.101
  From: sabbir <sip:sabbir@judo.u-aizu.ac.jp>
  To: ashir <sip:ashir@gifu.u-aizu.ac.jp>
  Call-ID: 627802098@judo.u-aizu.ac.jp
  CSeq: 3 INVITE
  Contact: <sip:sabbir@123.456.789.101>
  Content-Type: application/sdp
  Content-Length: 110

  v=0
  o=sabbir 2345 3345 IN IP4 judo.u-aizu.ac.jp
  c=IN IP4 123.456.789.101
  m=audio 2410 RTP/AVP 0
  a=sendonly

A 200/ok from P3 returned to P1 confirms the implicit mute.


  SIP/2.0 200 OK
  Via: SIP/2.0/UDP 123.456.789.101
  From: sabbir <sip:sabbir@judo.u-aizu.ac.jp>
  To: ashir <sip:ashir@gifu.u-aizu.ac.jp>;tag=659882292
  Call-ID: 627802098@judo.u-aizu.ac.jp
  CSeq: 3 INVITE
  Contact: <sip:ashir@123.456.789.103>
  Content-Type: application/sdp
  Content-Length: 110

  v=0
  o=sabbir 2345 3345 IN IP4 sound.u-aizu.ac.jp
  c=IN IP4 123.456.789.103
  m=audio 2410 RTP/AVP 0
  a=recvonly

4.4.  Attend

As illustrated by Figure 10, P1 sends a re-invite to all other participants (except for P2) with a modified sdp attribute, a=recvonly, who respond with 200/ok including a=sendonly along with other sdp attributes. A one-way rtp media connection is thus established with other participants (excepting P2), so P2 is attended by P1.

Figure 10. Attend Call Flow




  INVITE sip:ashir@gifu.u-aizu.ac.jp SIP/2.0
  Via: SIP/2.0/UDP 123.456.789.101
  From: sabbir <sip:sabbir@judo.u-aizu.ac.jp>
  To: ashir <sip:ashir@gifu.u-aizu.ac.jp>
  Call-ID: 627802099@judo.u-aizu.ac.jp
  CSeq: 4 INVITE
  Contact: <sip:sabbir@123.456.789.101>
  Content-Type: application/sdp
  Content-Length: 110

  v=0
  o=sabbir 2345 3345 IN IP4 judo.u-aizu.ac.jp
  c=IN IP4 123.456.789.101
  m=audio 2410 RTP/AVP 0
  a=recvonly

The 200/ok sequence looks like


  SIP/2.0 200 OK
  Via: SIP/2.0/UDP 123.456.789.101
  From: sabbir <sip:sabbir@judo.u-aizu.ac.jp>
  To: ashir <sip:ashir@gifu.u-aizu.ac.jp>;tag=659882293
  Call-ID: 627802099@judo.u-aizu.ac.jp
  CSeq: 4 INVITE
  Contact: <sip:ashir@123.456.789.104>
  Content-Type: application/sdp
  Content-Length: 110

  v=0
  o=sabbir 2345 3345 IN IP4 sound.u-aizu.ac.jp
  c=IN IP4 123.456.789.104
  m=audio 2410 RTP/AVP 0
  a=sendonly
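The four call flows of this section follow one pattern: the attribute determines which peers receive a re-invite and which direction is offered. The sketch below summarizes that mapping as described in Sections 4.1 through 4.4; the function name `reinvite_plan` and the tuple layout are hypothetical, and the expected answer is taken to be the complementary direction, as in the listings above.

```python
def reinvite_plan(attribute: str, initiator: str, target: str,
                  participants: list) -> list:
    """List (peer, offered direction, expected answer) for one narrowcasting command."""
    others = [p for p in participants if p not in (initiator, target)]
    if attribute == "mute":
        peers, offer = [target], "sendonly"
    elif attribute == "deafen":
        peers, offer = [target], "recvonly"
    elif attribute == "select":
        # Implicitly mute everyone except the selected target.
        peers, offer = others, "sendonly"
    elif attribute == "attend":
        # Implicitly deafen everyone except the attended target.
        peers, offer = others, "recvonly"
    else:
        raise ValueError(f"unknown attribute: {attribute}")
    answer = "recvonly" if offer == "sendonly" else "sendonly"
    return [(peer, offer, answer) for peer in peers]


print(reinvite_plan("mute", "P1", "P2", ["P1", "P2", "P3"]))
# [('P2', 'sendonly', 'recvonly')]
print(reinvite_plan("select", "P1", "P2", ["P1", "P2", "P3", "P4"]))
# [('P3', 'sendonly', 'recvonly'), ('P4', 'sendonly', 'recvonly')]
```

Seen this way, select and attend are mute and deafen applied to the complement of the chosen participant, which is why no new sip methods or headers are needed.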

5.  Conclusion and Future Work

In ordinary conversation, participants generally observe turn-taking, as in a csma/cd (carrier sense multiple access with collision detection) protocol with discretionary backoff. That is, an utterance that collides with another will cause one or both of the simultaneous speakers to stop and wait until a break before repeating.

One might wonder what happens to such conversational turn-taking in the presence of asymmetric media filters and the absence of a moderator. Narrowcasting features (like blocklists, side channels, and call-within-a-call) complicate teleconferences, since a deafened conversant might not be aware that another is talking and multiple sources might speak at once. If some avatars in a conference are muted or deafened to some other participants, without formal floor control there is a danger of some “talking on top of” others. In the absence of common floor control, won't private chats and decentralized control lead to anarchy? Without “traffic signals,” how can collisions be avoided?

In fact, such parallel conversation streams are not a problem. For example, if two participants set up a private side-conference using narrowcasting commands, even though their utterances might collide with others', they wouldn't expect or want others to stop conversing. Rather they “listen with one ear” to ongoing conversations while enjoying their own caucus. Listeners can still untangle conversational threads, by context, voice quality, etc. Just as in real social contexts, including informal gatherings like parties, multiple simultaneous speakers are analyzable. Even “linear” conversations like formal meetings might have some subsets of conversants whispering among themselves while a main speaker is talking. Narrowcasting interfaces will be even more useful when extended by spatial audio and attenuation based on mutual virtual position (source projection, sink bearing, and distance), distributing the respective voices across a soundscape.

Each participant's privacy, expressed as his or her media relationships with the other participants, requires consideration. In this article, we have introduced a design of new features for multimedia conferencing systems. These features could provide enhanced conference functions at the user end, “the edge of the network,” rather than at the server. As a result, a conference participant (not only an administrator) could easily control media transmission. We also described a method for implementing these features within the standard sip framework.
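As a rough illustration of endpoint-side control, the following sketch chooses the standard sdp direction attribute for one media stream, in the spirit of the a=sendonly line used in the sdp examples. It assumes one media stream per peer, viewed from the local endpoint. The function name `sdp_direction` and its two boolean parameters are hypothetical, and the sketch does not claim to reproduce the mapping from narrowcasting commands to sdp given in this article.

```python
def sdp_direction(send: bool, receive: bool) -> str:
    """Pick the SDP direction attribute for one media stream (local endpoint's view)."""
    if send and receive:
        return "a=sendrecv"   # media flows in both directions
    if send:
        return "a=sendonly"   # local endpoint sends but does not accept media
    if receive:
        return "a=recvonly"   # local endpoint accepts media but sends none
    return "a=inactive"       # stream is kept in the session but carries no media


# A participant who stops sending to a peer but keeps listening:
attribute = sdp_direction(send=False, receive=True)
```

Because the attribute is set by the endpoint in its own session description, a participant can change a media relationship by renegotiating the session rather than by asking a conference administrator.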

Future challenges include developing an algorithm for role-based policy, and adaptive media-mixing at a centralized media mixer for subscribed users.

Bibliography

[ACA04] Mohammad Sabbir Alam, Michael Cohen, and Ashir Ahmed. A Case Study of VoIP Performance Across Different Networks. Proc. icece: 3rd Int. Conf. on Electrical & Computer Engineering (Dhaka), December 2004, pp. 295-298, isbn 984-32-1804-4.

[ACA05] Mohammad Sabbir Alam, Michael Cohen, and Ashir Ahmed. Design of Narrowcasting Implementation in Sip. Proc. HC-2005: Eighth Int. Conf. on Human and Computer (Aizu-Wakamatsu), August 2005, pp. 255-260.

[ACA07] Mohammad Sabbir Alam, Michael Cohen, and Ashir Ahmed. Narrowcasting: Controlling Media Privacy in Sip Multimedia Conferencing. 4th ieee Consumer Communications and Networking Conference ccnc 2007 (Las Vegas), January 2007, pp. 110-115, isbn 1-4244-0667-6.

[CBP08] Gonzalo Camarillo, Mary Barnes, Jon Peterson, Cullen Jennings, and Oscar Novo. Session Initiation Proposal Investigation (Sipping). 2008, www.ietf.org/html.charters/sipping-charter.html, Last Accessed July 11th, 2008.

[Coh00] Michael Cohen. Exclude and include for audio sources and sinks: Analogs of mute & solo are deafen & attend. Presence: Teleoperators and Virtual Environments, 9 (2000), no. 1, pp. 84-96, issn 1054-7460.

[FAD06] Owen Noel Newton Fernando, Kazuya Adachi, Uresh Duminduwardena, Makoto Kawaguchi, and Michael Cohen. Audio Narrowcasting and Privacy for Multipresent Avatars on Workstations and Mobile Phones. ieice Trans. on Information and Systems, E89-D (2006), no. 1, pp. 73-87, issn 0916-8532.

[FCDK05] Owen Noel Newton Fernando, Michael Cohen, Uresh Chanaka Duminduwardena, and Makoto Kawaguchi. Duplex narrowcasting operations for multipresent groupware avatars on mobile devices. ijwmc: Int. J. of Wireless and Mobile Computing, 1 (2005), no. 5, Special Issue on Mobile Multimedia Systems and Applications, issn 1741-1084.

[HJ08] M. Handley and V. Jacobson. rfc 2327: sdp: Session Description Protocol. 1998, www.ietf.org/rfc/rfc2327.txt, Last Accessed July 11th, 2008.

[IT03] ITU-T. itu-t Recommendation H.323 (07/2003): Packet-based Multimedia Communications Systems. Series H: Audiovisual and multimedia systems, 2003, http://www.itu.int/rec/T-REC-H.323-200307-S/en, Last Accessed July 11th, 2008.

[Joh04] Alan B. Johnston. Sip: Understanding the Session Initiation Protocol. Artech House, London, 2004, isbn 1580531687.

[JRPJ08] Alan Johnston, Adam Roach, Jon Peterson, and Cullen Jennings. Centralized Conferencing (xcon). 2008, www.ietf.org/html.charters/xcon-charter.html, Last Accessed July 11th, 2008.

[KSW02] Petri Koskelainen, Henning Schulzrinne, and Xiaotao Wu. A sip-based Conference Control Framework. nossdav '02: Proc. 12th Int. Wkshp. on Network and Operating Systems Support for Digital Audio and Video, New York, NY, ACM Press, 2002, pp. 53-61, isbn 1-58113-512-2.

[Roa08] A. B. Roach. rfc 3265: Session Initiation Protocol (sip)-Specific Event Notification. 2002, www.ietf.org/rfc/rfc3265.txt, Last Accessed July 11th, 2008.

[Ros07] J. Rosenberg. rfc 4825: The Extensible Markup Language (XML) Configuration Access Protocol (xcap). May 2007, www.ietf.org/rfc/rfc4825.txt, Last Accessed July 11th, 2008.

[RSC08] J. Rosenberg, H. Schulzrinne, G. Camarillo, A. Johnston, J. Peterson, R. Sparks, M. Handley, and E. Schooler. rfc 3261: sip: Session Initiation Protocol. 2002, www.ietf.org/rfc/rfc3261.txt, Last Accessed July 11th, 2008.

[SKBR03] Paxton J. Smith, Peter Kabal, Maier L. Blostein, and Rafi Rabipour. Tandem-Free VoIP Conferencing: A Bridge to Next-Generation Networks. ieee Communications Magazine, 41 (2003), no. 5, pp. 136-145, issn 0163-6804.

[SNS01] Kundan Singh, Gautam Nair, and Henning Schulzrinne. Centralized Conferencing using sip. Proc. Internet Telephony Workshop, April 2001, New York.

License

Any party may pass on this Work by electronic means and make it available for download under the terms and conditions of the Digital Peer Publishing License. The text of the license may be accessed and retrieved at http://www.dipp.nrw.de/lizenzen/dppl/dppl/DPPL_v2_en_06-2004.html.
