Articulated Narrowcasting for Privacy and Awareness in Multimedia Conferencing Systems and Design for Implementation Within a SIP Framework

Sabbir Alam, Mohammad; Cohen, Michael; Ahmed, Ashir

HC 2006

Mohammad Sabbir Alam, University of Aizu
Michael Cohen, University of Aizu
Ashir Ahmed, Kyushu University

Published: 8 January 2009

Keywords

DOI: 10.20385/1860-2037/5.2008.14
URN: urn:nbn:de:0009-6-14724
swd:
- 4631612-7
- 4011134-9

Mohammad Sabbir Alam*, Michael Cohen†, and Ashir Ahmed‡

*R&D Dept., Mobile Technika Inc.
Shinjuku, Tokyo 162-0845
email: sabbir@mobiletechnika.jp

†Spatial Media Group, University of Aizu
Aizu-Wakamatsu, Fukushima-ken 965-8580; Japan
e-mail: {fd8062102,mcohen}@u-aizu.ac.jp

‡Dept. of CSCE, Kyushu University
744 Moto'oka, Fukuoka 819-0395; Japan
email: ashir@c.csce.kyushu-u.ac.jp

First presented at the International Conference of Human and Computer HC 2006,
extended and revised for JVRB

Abstract

This article proposes a new focus of research for multimedia conferencing systems which allows a participant to flexibly select another participant or a group for media transmission. For example, in a traditional conference system, participants' voices might by default be shared with all others, but one might want to select a subset of the conference members to send his/her media to or receive media from. We review the concept of narrowcasting, a model for limiting such information streams in a multimedia conference, and describe a design to use existing standard protocols (SIP and SDP) for controlling fine-grained narrowcasting sessions.

Keywords: Narrowcasting, SIP, SDP, conferencing, device capability, media direction control, privacy and awareness

Subjects: Session Initiation Protocol, Privacy

1. Introduction

Multimedia conferencing has been on the research agenda for many years. A traditional conferencing system over the PSTN (public switched telephone network) has many features implemented in a centrally controlled conference server. The development of IP technology has brought new media (e.g., video) into conferencing systems. H.323 [ IT03 ] and SIP (Session Initiation Protocol) [ RSC02, Joh04 ] are popular protocols for IP-based conferencing systems. SIP, a simpler text-based protocol developed by the IETF (Internet Engineering Task Force), added presence features allowing users to discover the availability of participants and also, with extensions, to control media transmission and its direction from the endpoints. Although SIP was designed for multimedia conferencing systems, so far only VoIP applications have gained popularity in the industry and received priority in the SIP design community (working groups). The SIPPING [ CBP08 ] and XCON [ JRPJ08 ] working groups inside the IETF are considering conferencing frameworks. While SIPPING is designing a conferencing framework using SIP, the XCON system is independent of any signaling protocol. Both conferencing models focus only on centralized conferencing systems, where the signaling and media mixing are handled by a central conference server and centralized media mixer.

However, one may want to control the media of a particular participant, e.g., participant P₁ might want to block media from participant P₂ or to receive media streams only from participant P₃. Controlling such media vectors from an endpoint has been a challenging issue. As a simple example, a user's voice might by default be shared with all others in a conference, but a versatile interface would allow a secret to be shared only with some selected subset of the members. Current commercially-available conference systems do not generally support such features.

Our research introduces a flexible multiparty multimedia user-adjustable conference system, including “narrowcasting” functionality, as an application within the SIP framework. A human user wants to distribute attention and availability, and narrowcasting provides a formalization of such presence filters. Narrowcasting systems extend broad- and multicasting systems by allowing media streams to be filtered for relevancy control, privacy, security, and user interface optimization. As SIP was designed for multimedia session control, narrowcasting attributes can be implemented within the existing SIP framework. In this article, we propose a design for narrowcasting attributes and consider the feasibility of implementing it in a SIP framework.

The rest of this article is structured as follows. Section 2 reviews some background information regarding conferencing. Section 3 explains our proposal for a SIP-based implementation. Section 4 details the call flow of a narrowcasting implementation in SIP. Finally, the conclusion and ideas for future research are presented in Section 5.

2. Conventional Conference Architecture and Call Control Limitations

This section discusses a common conference architecture, requirements of typical conference systems, and limitations of existing systems.

2.1. Architecture

Conventional conferencing systems can be categorized into three different types, depending upon where media streams from participants are mixed.

Centralized Conferencing
A centralized conference bridge [ SKBR03 ] exists in a centralized model. The conference bridge is a conceptually simple device, consisting of a SIP user agent to handle signaling, an RTP mixer to handle the media streams, a conference application layer for authentication, authorization & accounting services, and possibly conference control functions. Participants establish one-to-one media and signaling connections with the bridge. The bridge establishes voice paths between endpoints by collecting input signals and returning summed signals to conferees. Figure 1 illustrates how the media is mixed, (en/de)coded if necessary, and redistributed to participants.

Most current multimedia conferencing systems fall into this category. As permissions are controlled by an administrator (a.k.a. floor controller), end users don't have much access to configuration features.

Figure 1. Media Mixing in Centralized Conferencing

Decentralized Conferencing
In a decentralized model, signaling control is centralized but media are exchanged between participants without going through a centralized bridge. There is no conference server or central point of control. Decentralized conferencing can be either of two types: full mesh or multicast.

Figure 2. Full Mesh Conferencing

A. Full Mesh Conferencing: A full-duplex media link (Figure 2) can be established between every pair of participants, resulting in a fully-connected mesh. Each endpoint transmits a copy of its stream to the N - 1 other endpoints, and receives N - 1 streams in return, on separate ports. Each pair of participants can communicate through any mutually supported codec type.

B. Multicast Conferencing: In a multicast conference, participants join a session by subscribing to a conference multicast address. This address might be advertised by one of the participants or a central server, or distributed to conferees prior to a conference. Each participant transmits a single copy of his stream to the conference multicast address, receiving N - 1 streams in return. From a receiver perspective, nothing changes from the full mesh arrangement except that the streams arrive on a single port. Multicast conferences can scale up to millions of users and do not really require any SIP signaling. However, native multicast is not yet widely available.
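The per-endpoint stream counts of the two decentralized variants can be tallied with a short sketch. The function name and dictionary keys are illustrative, and the conference-wide totals, N(N - 1) and N, are simple arithmetic on the per-endpoint figures given above rather than values stated in the text.

```python
def stream_counts(n):
    """Count unidirectional media streams for a conference of n participants."""
    if n < 2:
        raise ValueError("a conference needs at least two participants")
    full_mesh = {
        "sent_per_endpoint": n - 1,      # one copy to each other endpoint
        "received_per_endpoint": n - 1,  # one stream from each other endpoint
        "sent_total": n * (n - 1),
    }
    multicast = {
        "sent_per_endpoint": 1,          # a single copy to the group address
        "received_per_endpoint": n - 1,
        "sent_total": n,
    }
    return full_mesh, multicast
```

For a five-party conference, for instance, each full-mesh endpoint sends four copies of its stream, whereas a multicast endpoint sends one.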

2.2. Requirements of a Flexible Conferencing System

To implement a flexible end-to-end conferencing system, the following considerations apply:

General requirements: A conference control framework should be scalable, extensible, generic, reliable, and secure. The scalability requirement means that the conference control framework must support reasonably large, geographically distributed, conferences. Moreover, it should be extensibly modular so that new components can be easily added or existing components changed. The conference control framework must also be generic so that it is not tied to any particular application. While conference control protocols are likely to consume significantly less bandwidth than media streams, some care needs to be taken for large conferences. Since the conference description and policy information can be massive, incremental updates are preferred to having to resend entire descriptions after each change. Similarly, changes in participant lists should be distributed as additions and removals. Also, not all participants care about the same level of detail; for example, some may only be interested when new members join or leave, but not when a participant adds herself to a floor queue. The importance of reliability and security is obvious.

Session establishment: A mechanism is required to establish connections among multiple participants, to manipulate and describe media “mixing” or “topology” for multiple media types (audio, video, text, position data, etc.). SIP is a good candidate for this purpose. Technical challenges involve flexibly defining the media and its transmission using the SDP (Session Description Protocol) [ HJ98 ].

Network resource management: Network resources are an important factor determining the communication quality of a conference, or “QoS” (quality of service). Conferencing on a best-effort internet is an on-going challenge. Large delay or jitter irritates participants and degrades conference quality. Considering network characteristics and available bandwidth, proper encoding/decoding schemes must be deployed.

Policy: A user rights database specifies the privileges of potential participants. User rights lists might include information about who can authorize the admission or expulsion of participants and who can act on floor control requests. Such functions are often combined into the role of a moderator, but a flexible system should allow them to be distributed among a set of participants.

Security and privacy: Unwelcome participants are excluded, so no unauthorized party may intrude upon or eavesdrop in a conference. A mechanism for membership and authorization control is required. The policy may describe which users are pre-authorized to join (“white list”), are explicitly forbidden from joining (“black list” or “block list”), or may join but in listen-only (“lurk”) mode. Since internet-based signaling protocols offer a variety of authentication mechanisms, a policy might also define at what strength each participant must authenticate. Unauthenticated users may be rejected or relegated to audience status.
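The incremental-update idea mentioned under the general requirements, distributing participant-list changes as additions and removals rather than resending the whole list, can be sketched as follows. The function names and the dictionary layout are hypothetical and do not correspond to any particular conference notification format.

```python
def roster_delta(previous, current):
    """Describe a participant-list change as additions and removals."""
    previous, current = set(previous), set(current)
    return {
        "added": sorted(current - previous),
        "removed": sorted(previous - current),
    }


def apply_delta(roster, delta):
    """Rebuild the new participant set from the old one plus a delta."""
    return (set(roster) | set(delta["added"])) - set(delta["removed"])
```

A subscriber that already holds the previous roster needs only the delta to stay in sync, which keeps notifications small for large conferences.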

2.3. Related Research

Over the years, there have been many studies in the area of conference control [ KSW02, SNS01 ]. Most earlier works discuss only floor control aspects of conference control. Standardization efforts have met with limited success. H.323, developed by the ITU-T, has several problems, including scalability issues due to an insufficient T.124 database replication protocol and its limitation to a binary ASN.1 format (not text-based) protocol. SIP, in contrast, is a text-based protocol which can easily interact with other internet protocols. SIP is a signaling protocol for creating, modifying, and terminating multimedia sessions between multiple participants. Conferencing is possible using standard SIP methods [ RSC02 ], allowing users to join and leave conferences and allowing invitation of other participants. However, SIP by itself does not offer configurable conference policies, participant access lists, floor control, or user privilege levels. The SIPPING (Session Initiation Proposal Investigation) [ CBP08 ] working group is chartered to develop requirements for extensions to SIP needed for multi-party applications. XCON, working closely with SIPPING, focuses on development of a standardized suite of protocols for tightly coupled multimedia conferences [ JRPJ08 ].

A limitation of traditional conferencing systems is that a participant (not a conference administrator) cannot control other participants' displays. Current conferencing systems generally do not let a participant select a subset of the conference participants to whom his or her media are sent or from whom streams are received. In this article, we introduce narrowcasting attributes to implement media restriction features within a SIP framework.

3. Enhancement of Conferencing System: Narrowcasting

In this section, we describe the feature set for narrowcasting in SIP-based conferences. In our group's earlier publications [ FCDK05, ACA05, FAD06 ], we introduced the concept of narrowcasting attributes, described functions to apply these features in a standard conferencing model (recapitulated in Figure 6), and proposed how features could be implemented using standard SIP methods and headers defined in RFC 3261 [ RSC02 ]. Advantages of such a deployment include the convenience that no new methods or header extensions would be required to implement the features.

3.1. Narrowcasting Concept

Figure 3 shows a famous Japanese sculpture which is a good example of narrowcasting attributes. Three monkeys, Mizaru (the monkey with eyes covered), Iwazaru (mouth covered), and Kikazaru (ears blocked), manifest the notion of limiting media vectors. Mizaru cannot see (but can hear and speak); Iwazaru cannot speak (but can see and hear); Kikazaru cannot hear (but can speak and see).

Figure 3. Media Restriction (Narrowcasting Attribute)

In analogy to broad-, multi-, and any-casting, narrowcasting is a technique for limiting and focusing information streams, either sources or sinks (receivers). We employ the paradigm of multiple simultaneous chatspaces, each with several or many conversants and across which one has “multipresence,” permitting designation of multiple instances of one's “self.” The audio windows narrowcasting predicate calculus [ Coh00 ] is a formalization for such a permission scheme. In Table 1, narrowcasting audio attributes are listed and their characteristics explained. This article proposes deployment of these attributes within a SIP framework.

Figure 4. A Three-Party Conference Model

Figure 4 shows the initial state of a conference in which three participants (P₁, P₂, and P₃) can talk to and hear each other. In other words, all the participants are in a fully connected media relationship. Our design will allow each user to send or receive data streams to/from a specific set of recipients in a session. For easier understanding, we consider only audio streams in this article. However, this design applies equally well to other media types, including video, text, and data (geographic location, for example).

Table 1. Proposed Audio Narrowcasting Attributes

Attributes	Description
Mute	blocks the media stream coming from a source. In Table 2(a,b), P₁ mutes P₂, i.e. P₁ blocks the media coming from P₂. As a result, P₁ does not hear P₂. However, P₂ can still hear P₁.
Select	limits the projected sound to particular sources. In Table 2(c,d), P₁ selects P₂, i.e. P₁ focuses on media coming from P₂. As a result, P₁ can listen only to P₂'s voice; P₁ can not hear other participants.
Deafen	blocks media streams going to a sink. In Table 2(e,f), P₁ deafens P₂, i.e. P₁ blocks media going towards P₂. As a result, P₂ can not hear P₁. The relationship between P₁ and other participants remains the same.
Attend	limits received sound to particular sinks. In Table 2(g,h), P₁ attends P₂, i.e. media from P₁ can go only to P₂. As a result, only P₂ can hear P₁ but others can't.

3.1.1. Source Functions: Mute and Select

A “mute” function is available in present-day conference systems. However, in most cases, a participant mutes herself by connecting the other conversant to “music on hold.” On-hold parties hear the music, but no voice media is transmitted. In our definition, a user can explicitly mute another party.

In Table 2(a), three participants participate in a conference in which P₂ has been muted by P₁. This means P₁ doesn't want to hear P₂, but only P₃. Specifically,

P₁ has a simplex (one-way) relationship with P₂, P₁ → P₂.
P₁ has a duplex (two-way) media relationship with P₃, P₁ ↔ P₃.
P₂ has a duplex media relationship with P₃, P₂ ↔ P₃.

As a result,

When P₁ speaks, both P₂ and P₃ will hear.
When P₂ speaks, only P₃ will hear (and NOT P₁).
When P₃ speaks, both P₁ and P₂ will hear.

Equivalently for this simple example, P₃ might be selected by P₁. The connectivity of the situation shown in Table 2(a) is representable in matrix form, where entry c_ij of the matrix represents connectivity of source i to sink j, and the main diagonal is populated by “don't care”s.
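The matrix for Table 2(a) can be reconstructed from the three audibility statements listed above. The following is a sketch under an assumed notation, not one taken from the original figure: an entry of 1 means that the sink hears the source, 0 means that it does not, and a cross marks the “don't care” diagonal.

```latex
% Rows are sources i, columns are sinks j, both ordered P_1, P_2, P_3.
% Only c_{21} is 0, because P_1 has muted P_2.
C_{\text{mute}} =
\begin{pmatrix}
\times & 1      & 1      \\
0      & \times & 1      \\
1      & 1      & \times
\end{pmatrix}
```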

A scenario with four participants in a session is shown in Table 2(d). Here P₂ is selected by P₁, so P₁ can hear only P₂ but not others. Other participants can hear as usual. The connectivity of Table 2(d) can be represented by a matrix of the same form.
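For the four-party select scenario, a consistent reconstruction follows directly from the description: P₁, as a sink, hears only P₂, and every other source-sink pair stays connected. The 1/0/cross notation is an assumption (1 for audible, 0 for blocked, a cross for the “don't care” diagonal), with rows as sources and columns as sinks.

```latex
% Column 1 (sink P_1) keeps only c_{21}; c_{31} and c_{41} are blocked.
C_{\text{select}} =
\begin{pmatrix}
\times & 1      & 1      & 1      \\
1      & \times & 1      & 1      \\
0      & 1      & \times & 1      \\
0      & 1      & 1      & \times
\end{pmatrix}
```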

3.1.2. Sink Functions: Deafen and Attend

Remote deafen is also available in full-featured conferencing systems as “listen-only mode.” In most cases, only an end-user or administrator may invoke this feature. In our definition, any user can control the media sent to or received from another.

Table 2. Narrowcasting Models

Control	Mute	Select	Deafen	Attend
Diagrams	(a), (b)	(c), (d)	(e), (f)	(g), (h)
Situation	A participant wants to block media from a specific participant. In this scenario, P₁ mutes P₂.	A participant wants to receive media only from a particular participant. In this scenario, P₁ selects P₂.	A participant wants to block media to specific participant(s). In this scenario, P₁ deafens P₂.	A participant wants to send media to a specific participant. In this scenario, P₁ attends P₂.
Result	P₁ has only a send-only relationship with P₂. Other media vectors remain the same.	Only P₁ ↔ P₂ remains the same. Other participants have a receive-only media relationship with P₁.	P₁ has a receive-only media relationship with P₂. Others remain the same.	Media from P₁ only goes to P₂. Others only send to P₁ but cannot receive media from P₁.

In Table 2(e), P₂ is deafened by P₁. This means P₁ doesn't want P₂ to hear his voice. Specifically,

P₁ has a simplex media relationship with P₂, P₁ ← P₂.
P₁ has a duplex media relationship with P₃.
P₂ has a duplex media relationship with P₃.

In this case:

When P₁ speaks, P₃ will hear, but P₂ won't.
When P₂ speaks, both P₁ and P₃ will hear.
When P₃ speaks, both P₁ and P₂ will hear.

Equivalently, P₃ might be attended by P₁, so that only P₃ can hear P₁. P₁ could still hear all other streams. The situation of Table 2(e) has a connectivity matrix of the same form.
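For the deafen case of Table 2(e), the three audibility statements above determine the matrix. As before, the notation is assumed rather than taken from the original figure: rows are sources, columns are sinks, 1 means audible, 0 means blocked, and a cross marks the “don't care” diagonal.

```latex
% Only c_{12} is 0, because P_1 has deafened P_2.
C_{\text{deafen}} =
\begin{pmatrix}
\times & 0      & 1      \\
1      & \times & 1      \\
1      & 1      & \times
\end{pmatrix}
```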

In Table 2(h), P₂ is attended by P₁. As a result, only P₂ can hear P₁. This situation can likewise be described by a connectivity matrix.
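The text does not state how many participants appear in Table 2(h). Assuming four, as in Table 2(d), and using the same assumed notation (rows are sources, columns are sinks, 1 for audible, 0 for blocked, a cross for “don't care”), the attend case might look like this:

```latex
% Row 1 (source P_1) reaches only P_2; c_{13} and c_{14} are blocked.
C_{\text{attend}} =
\begin{pmatrix}
\times & 1      & 0      & 0      \\
1      & \times & 1      & 1      \\
1      & 1      & \times & 1      \\
1      & 1      & 1      & \times
\end{pmatrix}
```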

For egalitarian models with flat hierarchies, there is an asymmetry regarding both mute/select and deafen/attend: audibility of a source with respect to a sink is treated as a revocable privilege and a forsakable right. A sink can by default hear collocated sources, adjustable by narrowcasting commands. For example, if P₂ attends P₁ but P₁ has muted P₂, P₁ won't hear P₂. Further policy extensions will extend the permissions of such a protocol, including the ability to force audibility by overriding a source's mute or sink's deafen (which a parent might invoke when telechiding a distracted child: “How dare you mute me?!”). Consideration of such role-based issues will be the focus of future research.
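How the four attributes combine can be illustrated with a small model. This is a simplified sketch and not the predicate calculus of [ Coh00 ]: the class and method names are hypothetical, and it assumes that a source is audible to a sink only when both the sink's filters (mute, select) and the source's filters (deafen, attend) allow it.

```python
from collections import defaultdict


class Narrowcast:
    """Toy model of the mute, select, deafen, and attend attributes."""

    def __init__(self, participants):
        self.participants = list(participants)
        self.muted = defaultdict(set)     # sink -> sources it blocks
        self.selected = defaultdict(set)  # sink -> the only sources it hears
        self.deafened = defaultdict(set)  # source -> sinks it blocks
        self.attended = defaultdict(set)  # source -> the only sinks it reaches

    def mute(self, who, source):
        self.muted[who].add(source)

    def select(self, who, source):
        self.selected[who].add(source)

    def deafen(self, who, sink):
        self.deafened[who].add(sink)

    def attend(self, who, sink):
        self.attended[who].add(sink)

    def audible(self, source, sink):
        """True if sink hears source under all current filters."""
        sink_allows = source not in self.muted[sink] and (
            not self.selected[sink] or source in self.selected[sink]
        )
        source_allows = sink not in self.deafened[source] and (
            not self.attended[source] or sink in self.attended[source]
        )
        return sink_allows and source_allows

    def matrix(self):
        """Connectivity matrix: rows are sources, columns are sinks.

        The diagonal holds None for the "don't care" entries.
        """
        return [
            [None if s == k else int(self.audible(s, k))
             for k in self.participants]
            for s in self.participants
        ]
```

In this model, if P₂ attends P₁ while P₁ has muted P₂, `audible("P2", "P1")` is false, matching the example in the paragraph above: the sink's mute is not overridden by the source's attend.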

3.2. SIP for Multimedia Conferencing

Peers in a SIP session are called user agents, and can function in the following roles:

User-Agent Client (UAC): A client application that initiates a SIP request.

User-Agent Server (UAS): A server application that contacts the user when a SIP request is received and returns a response on behalf of the user.

A SIP end-point is capable of functioning as both a UAC and a UAS, but typically functions as only one or the other per session, depending upon the user agent that initiated the request.

SIP makes use of elements called proxy servers to help route requests to users' current locations, authenticate and authorize users for services, implement provider call-routing policies, and provide features to users. SIP also provides a registration function that allows users to upload their current locations (IP addresses) for use by proxy servers.

3.3. Session Establishment in SIP

A typical hand-shaking exchange is shown in Figure 5, P₁ sending an INVITE request with media capabilities to P₂. A 100/Trying and a 180/Ringing message confirm that P₂ is being alerted. A 200/OK message (which might also contain the final session description message body, whose significance will be explained later) is sent once P₂ accepts the INVITE, usually triggered by a human user, notifying that a connection has been made. Upon receiving the 200/OK from P₂, P₁ sends an ACK. A two-party duplex session is established at this point. The delay between the 180/Ringing and 200/OK messages depends upon how many rings elapse before the user accepts the call. Participants wishing to leave a session send a BYE request within the session dialog [ ACA04 ].

Figure 5. Call Flow of a Typical, Simple SIP Session

SIP signaling can be transported on either TCP or UDP; a standard SIP entity must support both types [ RSC02 ]. For realizing narrowcasting attributes over SIP, a client will follow the guideline of RFC 3261 Section 18: If a request is within 200 bytes of the path MTU (maximum transmission unit), or if it is larger than 1300 bytes and the path MTU is unknown, the request must be sent using an RFC 2914 congestion-controlled transport protocol, such as TCP.
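The size rule can be expressed as a small decision function. This is a sketch that follows the wording of RFC 3261 Section 18.1.1, in which the 1300-byte limit applies when the path MTU is unknown. The function name is hypothetical, and reading “within 200 bytes” as “larger than MTU minus 200” is an assumption.

```python
def choose_transport(request_size, path_mtu=None):
    """Return "TCP" when the size rule requires a congestion-controlled
    transport, and "UDP" when UDP remains permissible.

    request_size and path_mtu are in bytes; path_mtu=None means unknown.
    """
    if path_mtu is None:
        # Unknown path MTU: fall back to the fixed 1300-byte threshold.
        return "TCP" if request_size > 1300 else "UDP"
    # Known path MTU: keep a 200-byte margin below it.
    return "TCP" if request_size > path_mtu - 200 else "UDP"
```

Because a re-INVITE carrying a short SDP body like those in Section 4 is only a few hundred bytes long, this rule would normally leave UDP available for the narrowcasting messages.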

Figure 6. Conferencing Model

3.4. The Conferencing Model

Narrowcasting attributes can be implemented in both centralized and decentralized conferences. This article focuses on a decentralized conference architecture, for which the media is mixed at each end-point. Figure 6 illustrates components of the conferencing system and their roles. We have extended the model being proposed by the IETF with narrowcasting attributes.

Focus: The focus is a SIP user agent addressed by a conference URI (uniform resource identifier). It handles SIP signaling between participants in a conference. The focus establishes media exchange among participants in a conference, and also implements conference policies. Its logical role is in analogy to that of a controller in a centrally signaling, distributed media architecture.

Participants: User agents are identified by a URI, communicating with each other after having been connected through the focus.

Conference notification service: The focus can act logically as a notifier [ Roa02 ], accepting subscriptions to the conference and notifying subscribers about changes to that state. The state includes the state maintained by the focus itself, the conference policy, and the media policy.

Conference policy server: A conference policy server stores and manipulates rules using an XCAP (Extensible Markup Language Configuration Access Protocol) [ Ros07 ] database associated with participation in a conference. These rules include directives on the lifespan of the conference, who can and cannot join it, who can override the media policy, definitions of roles available in the conference, and the responsibilities associated with those roles.

Conference policy: The complete set of rules governing a particular conference is interpreted and enforced by the conference policy server.

4. Design for Implementation of Narrowcasting Attributes in SIP

Narrowcasting attributes can be implemented inside SIP by modifying only the generator of the SDP message body. Section 3.3 described session establishment in SIP, where SDP is used to indicate media capabilities and destination addresses.

Media negotiation is part of the INVITE/200/ACK sequence to establish a SIP session between two endpoints. SIP itself doesn't provide media negotiation, but it enables media negotiation between user agents using SDP. Each participant sends information via SDP in either an INVITE or an ACK about her terminal's media capabilities and the transport address at which she wishes to receive RTP packets. In the SDP body attached to the SIP header, the user agents specify the media type, codec, IP address, and port number for each media stream. In the message body of the 200/OK response to the INVITE, the server sends its accepted media capabilities and the transport address to which the participant should send RTP packets. Our implementation in SIP [ ACA07 ] will use the narrowcasting attributes mute, select, deafen, and attend, along with the media capabilities in the INVITE/200/ACK sequence in the SDP bodies.

Figure 4 showed multiparty voice communication between P₁, P₂, and P₃. Considering the participants' media flow, we propose the protocol elaborated below. In our design we consider the existing standard media session and send a re-INVITE with a modified SDP body.
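To make the SDP fields concrete, the sketch below pulls the media type, port, payload formats, connection address, and direction attribute out of a body like those shown in the following subsections. It is a minimal illustration that assumes a single media stream and well-formed input; the function name `parse_sdp` and the returned keys are hypothetical. The default of sendrecv when no direction attribute is present is the standard SDP behavior.

```python
DIRECTIONS = ("sendonly", "recvonly", "sendrecv", "inactive")


def parse_sdp(body):
    """Extract basic media parameters from a single-stream SDP body."""
    info = {"direction": "sendrecv"}  # default when no direction is given
    for line in body.strip().splitlines():
        key, _, value = line.strip().partition("=")
        if key == "c":
            # c=<nettype> <addrtype> <connection-address>
            info["address"] = value.split()[2]
        elif key == "m":
            # m=<media> <port> <proto> <fmt> ...
            media, port, proto, *formats = value.split()
            info.update(media=media, port=int(port),
                        proto=proto, formats=formats)
        elif key == "a" and value in DIRECTIONS:
            info["direction"] = value
    return info
```

Applied to the mute offer of Section 4.1, this yields an audio stream on port 2410 with payload format 0 and direction sendonly, which is the only field the narrowcasting design needs to change.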

4.1. Mute

Figure 7 illustrates a scenario in which P₁, P₂, and P₃ are in an RTP media session. If P₁ wants to mute P₂, P₁ sends a re-INVITE to P₂ with a modified SDP attribute, a=sendonly. P₂ then responds with 200/OK including a=recvonly along with other SDP attributes. As the negotiation determines that media is sent only from P₁ to P₂, a one-way RTP connection is established (P₁ → P₂). Thus P₂ is muted by P₁. The status of other participants (i.e., P₃ in this example) remains unchanged. An example of the re-INVITE/OK handshake in Figure 7 is shown below, where the first block of each log is the SIP header and the second block is the SDP body.

Figure 7. Mute Call Flow

  INVITE sip:cohen@voice.u-aizu.ac.jp SIP/2.0
  Via: SIP/2.0/UDP 123.456.789.101
  From: sabbir <sip:sabbir@judo.u-aizu.ac.jp>
  To: cohen <sip:cohen@voice.u-aizu.ac.jp>
  Call-ID:627802096@judo.u-aizu.ac.jp
  CSeq: 1 INVITE
  Contact:<sip:sabbir@123.456.789.101>
  Content-type: application/sdp
  Content-Length: 110

  v=0
  o=sabbir 2345 3345 IN IP4 judo.u-aizu.ac.jp
  c=IN IP4 123.456.789.101
  m=audio 2410 RTP/AVP 0
  a=sendonly

The 200/OK response looks like

  SIP/2.0 200 OK
  Via: SIP/2.0/UDP 123.456.789.101
  From: sabbir <sip:sabbir@judo.u-aizu.ac.jp>
  To: cohen <sip:cohen@voice.u-aizu.ac.jp>;tag=659882290
  Call-ID:627802096@judo.u-aizu.ac.jp
  CSeq: 1 INVITE
  Contact:<sip:cohen@123.456.789.102>
  Content-type: application/sdp
  Content-Length: 110

  v=0
  o=sabbir 2345 3345 IN IP4 voice.u-aizu.ac.jp
  c=IN IP4 123.456.789.102
  m=audio 2410 RTP/AVP 0
  a=recvonly

4.2. Deafen

In order to deafen P₂, P₁ sends a re-INVITE to P₂ with a modified SDP attribute, a=recvonly. P₂ then responds with 200/OK including a=sendonly along with other SDP attributes. As the negotiation determines that media is transmitted only from P₂ to P₁, a simplex media connection is established (P₂ → P₁), so P₂ is deafened by P₁.

Figure 8. Deafen Call Flow

  INVITE sip:cohen@voice.u-aizu.ac.jp SIP/2.0
  Via: SIP/2.0/UDP 123.456.789.101
  From: sabbir <sip:sabbir@judo.u-aizu.ac.jp>
  To: cohen <sip:cohen@voice.u-aizu.ac.jp>
  Call-ID:627802097@judo.u-aizu.ac.jp
  CSeq: 2 INVITE
  Contact:<sip:sabbir@123.456.789.101>
  Content-type: application/sdp
  Content-Length: 110

  v=0
  o=sabbir 2345 3345 IN IP4 judo.u-aizu.ac.jp
  c=IN IP4 123.456.789.101
  m=audio 2410 RTP/AVP 0
  a=recvonly

The 200/OK response looks like

  SIP/2.0 200 OK
  Via: SIP/2.0/UDP 123.456.789.101
  From: sabbir <sip:sabbir@judo.u-aizu.ac.jp>
  To: cohen <sip:cohen@voice.u-aizu.ac.jp>;tag=659882291
  Call-ID:627802097@judo.u-aizu.ac.jp
  CSeq: 2 INVITE
  Contact:<sip:cohen@123.456.789.102>
  Content-type: application/sdp
  Content-Length: 110

  v=0
  o=sabbir 2345 3345 IN IP4 voice.u-aizu.ac.jp
  c=IN IP4 123.456.789.102
  m=audio 2410 RTP/AVP 0
  a=sendonly

4.3. Select

In order for P₁ to select P₂, P₁ sends a re-INVITE to all other participants except for P₂ with a modified SDP attribute, a=sendonly, and those participants respond with 200/OK including a=recvonly along with other SDP attributes. A one-way media connection is established between P₁ and the other participants excepting P₂, so P₂ is selected by P₁.

Figure 9. Select Call Flow

  INVITE sip:ashir@gifu.u-aizu.ac.jp SIP/2.0
  Via: SIP/2.0/UDP 123.456.789.101
  From: sabbir <sip:sabbir@judo.u-aizu.ac.jp>
  To: ashir <sip:ashir@gifu.u-aizu.ac.jp>
  Call-ID:627802098@judo.u-aizu.ac.jp
  CSeq: 3 INVITE
  Contact:<sip:sabbir@123.456.789.101>
  Content-type: application/sdp
  Content-Length: 110

  v=0
  o=sabbir 2345 3345 IN IP4 judo.u-aizu.ac.jp
  c=IN IP4 123.456.789.101
  m=audio 2410 RTP/AVP 0
  a=sendonly

A 200/OK from P₃ returned to P₁ confirms the implicit mute.

  SIP/2.0 200 OK
  Via: SIP/2.0/UDP 123.456.789.101
  From: sabbir <sip:sabbir@judo.u-aizu.ac.jp>
  To: ashir <sip:ashir@gifu.u-aizu.ac.jp>;tag=659882292
  Call-ID:627802098@judo.u-aizu.ac.jp
  CSeq: 3 INVITE
  Contact:<sip:ashir@123.456.789.103>
  Content-type: application/sdp
  Content-Length: 110

  v=0
  o=sabbir 2345 3345 IN IP4 sound.u-aizu.ac.jp
  c=IN IP4 123.456.789.103
  m=audio 2410 RTP/AVP 0
  a=recvonly

4.4. Attend

As illustrated by Figure 10, P₁ sends a re-INVITE to all other participants (except for P₂) with a modified SDP attribute, a=recvonly, and they respond with 200/OK including a=sendonly along with other SDP attributes. A one-way RTP media connection is thus established with the other participants (excepting P₂), so P₂ is attended by P₁.

Figure 10. Attend Call Flow

  INVITE sip:ashir@gifu.u-aizu.ac.jp SIP/2.0
  Via: SIP/2.0/UDP 123.456.789.101
  From: sabbir <sip:sabbir@judo.u-aizu.ac.jp>
  To: ashir <sip:ashir@gifu.u-aizu.ac.jp>
  Call-ID:627802099@judo.u-aizu.ac.jp
  CSeq: 4 INVITE
  Contact:<sip:sabbir@123.456.789.101>
  Content-type: application/sdp
  Content-Length: 110

  v=0
  o=sabbir 2345 3345 IN IP4 judo.u-aizu.ac.jp
  c=IN IP4 123.456.789.101
  m=audio 2410 RTP/AVP 0
  a=recvonly

The 200/OK response looks like

  SIP/2.0 200 OK
  Via: SIP/2.0/UDP 123.456.789.101
  From: sabbir <sip:sabbir@judo.u-aizu.ac.jp>
  To: ashir <sip:ashir@gifu.u-aizu.ac.jp>;tag=659882293
  Call-ID:627802099@judo.u-aizu.ac.jp
  CSeq: 4 INVITE
  Contact:<sip:ashir@123.456.789.104>
  Content-type: application/sdp
  Content-Length: 110

  v=0
  o=sabbir 2345 3345 IN IP4 sound.u-aizu.ac.jp
  c=IN IP4 123.456.789.104
  m=audio 2410 RTP/AVP 0
  a=sendonly
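The four call flows of Sections 4.1 to 4.4 differ only in which peers receive the re-INVITE and which direction attribute is offered. The sketch below summarizes that mapping; the function name and the tuple layout are hypothetical, and it only plans the offers and expected answers rather than sending any SIP messages.

```python
# Direction a peer is expected to answer with for a given offer.
ANSWER = {"sendonly": "recvonly", "recvonly": "sendonly"}


def reinvite_plan(command, issuer, target, participants):
    """List (peer, offered attribute, expected answer) for a command.

    command is one of "mute", "deafen", "select", "attend";
    issuer applies it to target within the given participant list.
    """
    others = [p for p in participants if p != issuer]
    if command == "mute":        # issuer stops hearing target
        peers, offer = [target], "sendonly"
    elif command == "deafen":    # target stops hearing issuer
        peers, offer = [target], "recvonly"
    elif command == "select":    # issuer hears only target
        peers, offer = [p for p in others if p != target], "sendonly"
    elif command == "attend":    # only target hears issuer
        peers, offer = [p for p in others if p != target], "recvonly"
    else:
        raise ValueError(f"unknown narrowcasting command: {command}")
    return [(peer, f"a={offer}", f"a={ANSWER[offer]}") for peer in peers]
```

For the three-party conference of Figure 4, `reinvite_plan("select", "P1", "P2", ["P1", "P2", "P3"])` plans a single re-INVITE to P₃ offering a=sendonly, which corresponds to the exchange shown in Section 4.3.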

5. Conclusion and Future Work

In ordinary conversation, participants generally observe turn-taking, as in a CSMA/CD (carrier sense multiple access with collision detection) protocol with discretionary backoff. That is, an utterance that collides with another will cause one or both of the simultaneous speakers to stop and wait until a break before repeating.

One might wonder what happens to such conversational turn-taking in the presence of asymmetric media filters and the absence of a moderator. Narrowcasting features - like blocklists, side channels, and call-within-a-call - complicate teleconferences, since a deafened conversant might not be aware that another is talking and multiple sources might speak at once. If some avatars in a conference are muted or deafened to some other participants, without formal floor control there is a danger of some “talking on top of” others. In the absence of common floor control, won't private chats and decentralized control lead to anarchy? Without “traffic signals,” how can collisions be avoided?

In fact, such parallel conversation streams are not a problem. For example, if two participants set up a private side-conference using narrowcasting commands, even though their utterances might collide with others', they wouldn't expect or want others to stop conversing. Rather they “listen with one ear” to ongoing conversations while enjoying their own caucus. Listeners can still untangle conversational threads, by context, voice quality, etc. Just as in real social contexts, including informal gatherings like parties, multiple simultaneous speakers are analyzable. Even “linear” conversations like formal meetings might have some subsets of conversants whispering among themselves while a main speaker is talking. Narrowcasting interfaces will be even more useful when extended by spatial audio and attenuation based on mutual virtual position (source projection, sink bearing, and distance), distributing the respective voices across a soundscape.

The status of each participant's privacy, in terms of the media relationship with other participants, requires consideration. In this article, we have introduced a design of new features for multimedia conferencing systems. These features could provide enhanced conference functions at the user end, “the edge of the network,” rather than at the server. As a result, a conference participant (not only an administrator) could easily control media transmission. We also described the design of these features and a method of implementing them within the standard sip framework.

Future challenges include developing an algorithm for role-based policy, and adaptive media-mixing at a centralized media mixer for subscribed users.

Bibliography

[ACA04] Mohammad Sabbir Alam, Michael Cohen, and Ashir Ahmed, A Case Study of VoIP Performance Across Different Networks, Proc. icece: 3rd Int. Conf. on Electrical & Computer Engineering (Dhaka), December 2004, pp. 295–298, isbn 984-32-1804-4.

[ACA05] Mohammad Sabbir Alam, Michael Cohen, and Ashir Ahmed, Design of Narrowcasting Implementation in Sip, Proc. HC-2005: Eighth Int. Conf. on Human and Computer (Aizu-Wakamatsu), August 2005, pp. 255–260.

[ACA07] Mohammad Sabbir Alam, Michael Cohen, and Ashir Ahmed, Narrowcasting: Controlling Media Privacy in Sip Multimedia Conferencing, 4th ieee Consumer Communications and Networking Conference ccnc 2007 (Las Vegas), January 2007, pp. 110–115, isbn 1-4244-0667-6.

[CBP08] Gonzalo Camarillo, Mary Barnes, Jon Peterson, Cullen Jennings, and Oscar Novo, Session Initiation Proposal Investigation (Sipping), 2008, www.ietf.org/html.charters/sipping-charter.html, Last Accessed July 11th, 2008.

[Coh00] Michael Cohen, Exclude and include for audio sources and sinks: Analogs of mute & solo are deafen & attend, Presence: Teleoperators and Virtual Environments, 9 (2000), no. 1, 84–96, issn 1054-7460.

[FAD06] Owen Noel Newton Fernando, Kazuya Adachi, Uresh Duminduwardena, Makoto Kawaguchi, and Michael Cohen, Audio Narrowcasting and Privacy for Multipresent Avatars on Workstations and Mobile Phones, ieice Trans. on Information and Systems, E89-D (2006), no. 1, 73–87, issn 0916-8532.

[FCDK05] Owen Noel Newton Fernando, Michael Cohen, Uresh Chanaka Duminduwardena, and Makoto Kawaguchi, Duplex narrowcasting operations for multipresent groupware avatars on mobile devices, ijwmc: Int. J. of Wireless and Mobile Computing, 1 (2005), no. 5, Special Issue on Mobile Multimedia Systems and Applications, issn 1741-1084.

[HJ08] M. Handley and V. Jacobson, rfc 2327: sdp: Session Description Protocol, 1998, www.ietf.org/rfc/rfc2327.txt, Last Accessed July 11th, 2008.

[IT03] ITU-T, itu-t Recommendation H.323 (07/2003): Packetbased Multimedia Communications Systems, 2003, http://www.itu.int/rec/T-REC-H.323-200307-S/en, Series H: Audiovisual and multimedia systems, Last Accessed July 11th, 2008.

[Joh04] Alan B. Johnston, Sip: Understanding the Session Initiation Protocol, Artech House, London, 2004, isbn 1580531687.

[JRPJ08] Alan Johnston, Adam Roach, Jon Peterson, and Cullen Jennings, Centralized Conferencing (xcon), 2008, www.ietf.org/html.charters/xcon-charter.html, Last Accessed July 11th, 2008.

[KSW02] Petri Koskelainen, Henning Schulzrinne, and Xiaotao Wu, A sip-based Conference Control Framework, nossdav '02: Proc. 12th Int. Wkshp. on Network and Operating Systems Support for Digital Audio and Video (New York, NY), ACM Press, 2002, pp. 53–61, isbn 1-58113-512-2.

[Roa08] A. B. Roach, rfc 3265 - Session Initiation Protocol (sip) Specific Event Notification, 2002, www.ietf.org/rfc/rfc3265.txt, Last Accessed July 11th, 2008.

[Ros07] J. Rosenberg, rfc 4825: The Extensible Markup Language (XML) Configuration Access Protocol (xcap), May 2007, www.ietf.org/rfc/rfc4825.txt, Last Accessed July 11th, 2008.

[RSC08] J. Rosenberg, H. Schulzrinne, G. Camarillo, A. Johnston, J. Peterson, R. Sparks, M. Handley, and E. Schooler, rfc 3261: sip: Session Initiation Protocol, 2002, www.ietf.org/rfc/rfc3261.txt, Last Accessed July 11th, 2008.

[SKBR03] Paxton J. Smith, Peter Kabal, Maier L. Blostein, and Rafi Rabipour, Tandem-Free VoIP Conferencing: A Bridge to Next-Generation Networks, ieee Communications Magazine, 41 (2003), no. 5, 136–145, issn 0163-6804.

[SNS01] Kundan Singh, Gautam Nair, and Henning Schulzrinne, Centralized Conferencing using sip, Proc. Internet Telephony Workshop, April 2001, New York.

License

Anyone may transmit this work electronically and make it available for download under the terms of the Digital Peer Publishing License. The text of the license is available on the Internet at http://www.dipp.nrw.de/lizenzen/dppl/dppl/DPPL_v2_de_06-2004.html.

Recommended Citation

Mohammad Sabbir Alam, Michael Cohen, and Ashir Ahmed, Articulated Narrowcasting for Privacy and Awareness in Multimedia Conferencing Systems and Design for Implementation Within a SIP Framework. JVRB - Journal of Virtual Reality and Broadcasting, 5(2008), no. 14. (urn:nbn:de:0009-6-14724)

When citing this article, please give the exact URL and the date of your last visit to this online address.
