Nate Anderson writes an interesting blurb on the P2P-Next research project in the Netherlands. The researchers hope to build a platform suitable for live TV delivery over the Internet:
Dutch academic Dr. Johan Pouwelse knows BitTorrent well, having spent a year of his life examining its inner workings. Now, as the scientific director of the EU-funded P2P-Next team, Pouwelse and his researchers have been entrusted with €19 million from the EU and various partners, and what they want in return is nothing less than a “4th-generation” peer-to-peer system that will one day be tasked with replacing over-the-air television broadcasts.
P2P-Next is the largest publicly-funded team in the world working on such technology (though plenty of researchers at Microsoft, IBM, and countless tiny startups are also racing to deliver a better P2P experience), and today the team launched a trial program designed to test its progress to date.
What sets the project apart from the traditional BitTorrent architecture is its focus not on downloadable video, but on live streaming. Current BitTorrent implementations, focused as they are on offering easy access to downloadable content, aren’t well suited to delivering live streaming TV across the Internet, but Pouwelse is convinced that this is the future. There’s “no doubt that TV will come through the Internet in a few years,” he told Ars earlier this week. Obviously, deployment of such a system depends on consumer electronics firms and broadcasters, but Pouwelse’s job is to make sure that the technology is ready when they are.
P2P has a lot of problems as a delivery vehicle for live TV, so I don’t think this is a good approach, but a system that caches popular content in numerous places has the potential to distribute large, popular files with little redundant delivery. The important feature of such a system is its caching capability, however, not its “peer-to-peerness.”
See Torrent Freak for many more details.
Richard, P2P will always be the poor man’s cache and delivery platform; it will never work as well as a distributed centralized approach. Its key advantage is that it can offload server and bandwidth costs onto someone else, but if the goal is to maximize performance while minimizing bandwidth utilization, trying to use the very fringes of the Internet to deliver data is never going to be that efficient, because it requires the data to travel longer distances and cross more networks.
Which is more efficient: caching on or near the DSLAM or CMTS, or caching on the end points and using up what little upstream capacity there is? This may not be a big problem for DSL, U-Verse, or FTTH networks, because their upstream is dedicated bandwidth, but it would crush the shared upstreams of cable networks. And even then, the latency and jitter will be much higher when the content is coming from the edges.
The biggest objection consumers will have is paying the extra bill for power consumption and extra storage, and getting their upstream clogged up, so that someone else doesn’t need to build servers in a data center. And why should broadband providers build symmetric networks when it’s so much cheaper to simply host the content in a data center in the cloud?
Even if we had symmetrical broadband networks, scattering the system across the edges doesn’t necessarily make it more reliable than centralizing it. Take the storage system, for example: you’d need at least two copies for the system to be redundant, and more likely four or more, because you don’t know when a user at an end point might want to use their upstream or disable the system. With a centralized system, you only need one redundant drive for every five data drives, because you can use technologies like RAID 6. That means roughly a 20% overhead for redundant storage in a data center, compared to a 300% storage redundancy overhead in a distributed system. This whole thing just doesn’t make sense.
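A back-of-the-envelope sketch of that overhead comparison, using the same illustrative figures (a 10+2 RAID 6 group versus four whole copies in separate homes):

```python
# Back-of-the-envelope comparison of storage redundancy overhead:
# a centralized RAID 6 array versus whole-file copies in separate homes.
# All figures are illustrative, mirroring the numbers in the comment above.

def raid6_overhead(data_drives: int, parity_drives: int = 2) -> float:
    """Redundancy overhead of a RAID 6 group, as a fraction of the data stored."""
    return parity_drives / data_drives

def replication_overhead(copies: int) -> float:
    """Overhead of keeping N full copies: everything past the first copy is overhead."""
    return float(copies - 1)

# Ten data drives protected by two parity drives -> 20% overhead.
print(f"RAID 6 (10 data + 2 parity): {raid6_overhead(10):.0%} overhead")

# Four full copies scattered across four homes -> 300% overhead.
print(f"4 copies on end points:      {replication_overhead(4):.0%} overhead")
```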
It’s not clear what you mean by a “distributed centralized approach,” George, because that’s what P2P enables, if done correctly.
The biggest advantage of P2P, other than cost-shifting, is scalability, as the data center server has only so much CPU and bandwidth available.
Content distribution will clearly continue to happen as disk drives, CPUs, and bandwidth get cheaper, so there’s not much to be gained by hiding your head in the sand and pretending that permanent laws of computer science are against distribution.
And as far as your argument on power goes, that’s quite bizarre this late in the day. We did Wake On LAN in the 90s to adapt power consumption to activity, and it still works.
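For the curious, waking a sleeping machine takes a single UDP “magic packet”; here is a minimal sketch of a Wake-on-LAN sender (the MAC address in the example is a placeholder):

```python
# Minimal Wake-on-LAN sender. The "magic packet" is six 0xFF bytes followed
# by the target's MAC address repeated sixteen times, sent as a UDP broadcast.
import socket

def wake_on_lan(mac: str, broadcast: str = "255.255.255.255", port: int = 9) -> None:
    mac_bytes = bytes.fromhex(mac.replace(":", "").replace("-", ""))
    packet = b"\xff" * 6 + mac_bytes * 16
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as sock:
        sock.setsockopt(socket.SOL_SOCKET, socket.SO_BROADCAST, 1)
        sock.sendto(packet, (broadcast, port))

# Example with a placeholder MAC address:
# wake_on_lan("00:11:22:33:44:55")
```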
What I mean by a distributed centralized approach is similar to the CDN model, but with the servers moved as close to the edges as possible without going all the way to the end points. That means servers near the DSLAM and CMTS. Once you get to the end points, you lose the ability to leverage RAID technology with low redundancy overhead, and you need to store four copies of the same file in four separate homes rather than storing it on a RAID 6 array that carries only 20% overhead for redundancy.
A lot of set-top boxes, like the $100 Roku box, use small flash devices because many customers don’t want noisy hard drives in their systems. But like I said, the biggest problem with P2P systems is that they saturate the consumer’s upstream, which is a bigger problem for the consumer than for the ISP.
At the end of the day, which distribution model dominates is an economic decision. As with anything, it will always be a combination of solutions, with multiple technology platforms thriving at the same time. P2P will be used where it makes sense, for free out-of-order download services like Vuze, while CDNs will make the most sense for fee-based, reliable, in-order video on demand. P2P will gobble up any excess capacity on a well-managed network, or way more capacity than the network can reasonably support on a dumb network. So long as last-mile networks are asymmetrical, this extra capacity is only going to serve a small percentage of the population at full download speed, or a large percentage of the population at a very slow speed.
But to suggest that P2P will replace everything else doesn’t, in my view, make sense economically. Taking distribution to the edges of the network simply costs more money in storage, and it would require symmetrical broadband networks everywhere. This, in my opinion, is a roundabout way of delivering content, because consumers primarily want to consume and only a small percentage of them want to serve. I don’t see the benefit of trying to create a symmetrical last mile and forcing consumers to host servers when the demand simply isn’t there. At the end of the day, someone has to bear these extra costs, and it’s always going to be the consumer.
P2P’s biggest advantage is scalability. Think about that.
We’re still at the beginning of the P2P era, and we make a serious mistake to assume that the things we can do with P2P today represent the limit of the technology. The IETF and DCIA are working to housebreak P2P’s bandwidth appetite, which is currently its most objectionable feature.
Many of us access the Internet from multiple locations and multiple devices, for example, and we need to access the same set of personal data wherever we are. A home-based server is one way to do this, and one of the features of such a server is the ability to download content from diverse sources as a background activity. It makes sense for house-broken, legal P2P to be one of the options for this, and I see no reason it can’t use RAID or any other disk technology natively.
I don’t think users are wholly comfortable with entrusting personal data to a third party locker service, for example; I’d much rather have my personal data under my control.
I’m not saying that P2P doesn’t have a future, Richard, and I think you’re missing my point. I’m saying that any time someone claims technology X will revolutionize and replace technology Y, take it with a grain of salt. That’s not to say that X won’t make a huge impact in the future.
Regarding RAID, I think you missed my point. I’m saying that when you centralize your storage, it’s a lot less expensive than putting it at the end points. If you put it at the end points and you want it to be available and reliable, you’ll need at least four copies of it in four separate homes. If you store it centrally, you only need one copy plus two parity drives for every ten drives holding data.
Disk drives are cheap enough now that RAID 1 mirroring is a practical and inexpensive backup system, George, and it doesn’t have privacy issues.
Richard,
You just tried to make the scalability argument, and I’m trying to examine that argument closely. I am not talking about RAID at each home; that would make the P2P distribution solution even MORE expensive. I’m saying that you need a separate copy in at least four different homes, or at the very minimum two, for the file to be reliable. It’s already bad enough that you expect to turn people’s homes into servers; it’s worse if you expect them to be reliable and implement in-home redundancy.
At the end of the day, any distribution system requires a storage system to hold the terabytes upon terabytes of video. If you put it in the home, you’re going to need at least two to four times the storage that you would need on a dedicated centralized storage server.
By the way, one of the most disingenuous labels I’ve seen yet is the DCIA’s characterization of P2P as “distributed computing.” P2P nodes do no computing except a small amount to figure out how to redistribute data VERBATIM, with no processing whatsoever! And perhaps they invest a bit to see just how thoroughly they can saturate the pipes and how effectively they can rob bandwidth from more important uses.
It is a “distributed server network” though. I don’t consider P2P a very well-behaved application at the moment, but with appropriate management it has some potentially interesting uses.
Unfortunately, P2P can never be as efficient as a simple, direct download — which in turn can never be one millionth as efficient as a broadcast medium such as the airwaves. There is only one thing it’s good at: hiding the source of pirated content. Which is why it was invented….
Efficiency is nice, but there are other attributes that make engineered systems valuable. The scalability of P2P is its best feature, and it has that hands down over centralized client/server.
The main scalability aspect of P2P is its ability to use up any spare capacity on a Broadband network and any available storage on home networks. But in terms of total raw material and cost of building a reliable distribution system for a given capacity, the distributed data center model is the cheapest by far. From a distance standpoint, what’s closer? Network-to-client or client-to-network-to-client? It’s obvious that the former is closer.
I can see P2P harnessing what’s left over, but I don’t see the point of going out of your way to build more storage capacity into P2P networks than consumers are naturally willing to bear. I also don’t see the benefit of building symmetric networks when consumers for the most part don’t demand them.
If there is no fundamental efficiency gain in a P2P system and the total cost of building it is higher, then it isn’t scalable.
Once again, I’m not making an efficiency argument; I’m making an argument for scalability. As the demand for a piece of content grows, P2P ensures that the bandwidth grows with it, and that applies to CPU and disk as well as communications bandwidth.
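A toy model makes the point; the capacities below are purely illustrative assumptions, not measurements of any real server or swarm:

```python
# Toy model: a central server has a fixed outbound capacity, while a swarm's
# aggregate upload capacity grows with the number of peers watching.
# All capacities are illustrative assumptions, not measurements.

SERVER_CAPACITY_MBPS = 10_000   # assumed data-center uplink
PEER_UPLOAD_MBPS = 1.0          # assumed usable upstream per peer
STREAM_RATE_MBPS = 2.0          # assumed bitrate of the content

def streams_supported(viewers):
    """Concurrent full-rate streams supported by client/server vs. a seeded swarm."""
    client_server = SERVER_CAPACITY_MBPS / STREAM_RATE_MBPS
    swarm = (SERVER_CAPACITY_MBPS + viewers * PEER_UPLOAD_MBPS) / STREAM_RATE_MBPS
    return client_server, swarm

for viewers in (1_000, 100_000, 1_000_000):
    cs, p2p = streams_supported(viewers)
    print(f"{viewers:>9,} viewers: client/server caps at {cs:,.0f} streams, "
          f"swarm capacity is {p2p:,.0f}")
```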
P2P does have some advantages over centralized client/server, and there’s no sweeping that fact under the rug. Like DPI, P2P is a tool that can be used for good or ill.
Yes, I very much understand the concept of scalability; it’s why I argued for a hybrid webseeding model. In that webseeding piece I wrote, I said that client-server worked well when there were few users and P2P worked well when there were many. That’s why I said combining the two concepts gives you the best of both worlds.
However, that is simply a repeater/multiplier system bolted on top of the client-server model, and it doesn’t require substantial storage on the end points. It does not replace the need for centralized storage and centralized seeding. It’s also limited by the upload speed of individual broadband users if you want an in-order, on-demand stream. So until upstreams are substantially closer to downstreams in ratio, this won’t work.
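A quick sketch of that upstream constraint, under assumed numbers: if each viewer contributes less upstream than the stream bitrate, the central webseed has to make up the difference, and the shortfall grows with the audience.

```python
# Sketch of the upstream-asymmetry point: when each peer uploads less than the
# stream bitrate, the webseed must supply the shortfall for in-order delivery.
# The bitrate and per-peer upload figures are assumptions, not measurements.

STREAM_RATE_MBPS = 2.5   # assumed in-order, on-demand stream bitrate
PEER_UPLOAD_MBPS = 0.5   # assumed usable upstream contribution per viewer

def seed_bandwidth_needed(viewers: int) -> float:
    """Webseed capacity (Mbps) needed so every viewer receives the full stream rate."""
    demand = viewers * STREAM_RATE_MBPS
    peer_supply = viewers * PEER_UPLOAD_MBPS
    return max(demand - peer_supply, 0.0)

for viewers in (100, 10_000):
    print(f"{viewers:>6,} viewers: webseed must still supply "
          f"{seed_bandwidth_needed(viewers):,.0f} Mbps")
```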
The CDN model, on the other hand, doesn’t have to worry about broadband upload speeds. The downside of a CDN is that the distributor must bear the burden of paying for it.
The CDN model also has a hard limit on the number of streams it can serve, but P2P is more dynamic. Most dynamic systems are less efficient than static systems, but they have a place.
Richard, the CDN model’s only limit is how much bandwidth you give the servers and how many servers you distribute out there. If you want in-order, high-speed, reliable service, a CDN is the only way to go, and the market has proven this.
P2P, or even P4P, will always result in double the last-mile network utilization, because every byte served by a peer has to travel up one subscriber’s last mile before it comes down another’s.
CDNs don’t deal well with transient effects, such as high peak demand for popular content. We’ve seen several examples of a popular piece of content melting down CDN servers, going back to the Victoria’s Secret incident.
P2P has some definite uses, and does some things quite well.
“P2P has some definite uses, and does some things quite well.”
No argument there, Richard, and I’ve always argued for a hybrid client-server and P2P approach. All I’m saying is that, as with any technology or approach, it won’t become the only solution.
Nobody ever said otherwise. At the FCC hearing today, Mark Cuban touted Multicast, just like I did at Harvard. That’s the ticket.
I agree. Switching cable’s static, circuit-switched allocation of resources for analog/digital video to a dynamic, need-based multicast system would help a lot, but that would only work if the network operator is permitted to protect its own video programming from network congestion. Once this is done, only the channels actually being watched will be transmitted, leaving cable networks more than an order of magnitude more bandwidth than they have today for Internet service.
It might also be possible to use dynamic quality levels. For example, if only one person watches a certain program, the video quality might be just 2 Mbps at 480p. If five people watch the same program, the quality might be 5 Mbps at 720p, yet the total bandwidth consumed is still half of what five separate 2 Mbps streams would be. Then, if 30 people are watching the same show, allocate 30 Mbps of multicast bandwidth at 1080p. This would give users a choice of several hundred channels while efficiently using the existing pipes.
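As a rough sketch (using the same illustrative viewer thresholds and bitrates, not a real channel lineup), the tier selection might look like this:

```python
# Sketch of the dynamic-quality idea: pick a multicast bitrate tier based on
# how many viewers are tuned to a channel, and compare it with sending each
# viewer a separate 2 Mbps unicast stream. Thresholds and bitrates follow the
# illustrative figures in the comment above; this is not a real spec.

def multicast_tier(viewers: int):
    """Return (quality label, Mbps) for a channel with this many viewers."""
    if viewers >= 30:
        return "1080p", 30
    if viewers >= 5:
        return "720p", 5
    return "480p", 2

for viewers in (1, 5, 30):
    quality, mbps = multicast_tier(viewers)
    unicast_mbps = viewers * 2   # each viewer pulling their own 2 Mbps stream
    print(f"{viewers:>2} viewers: one {mbps} Mbps {quality} multicast stream "
          f"vs. {unicast_mbps} Mbps of unicast")
```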
Dynamic transcoding is actually a common technique in IPTV systems today.
Getting into Switched Digital Video is a fairly radical infrastructure change, but many cable systems are going that way. It plays hell with CableCARDs, but CableLabs just green-lighted a couple of SDV boxes, so it’s just a matter of money.