Stories
Slash Boxes
Comments

SoylentNews is people

Meta

Log In

Log In

Create Account  |  Retrieve Password


Site News

Join our Folding@Home team:
Main F@H site
Our team page


Funding Goal
For 6-month period:
2022-07-01 to 2022-12-31
(All amounts are estimated)
Base Goal:
$3500.00

Currently:
$438.92

12.5%

Covers transactions:
2022-07-02 10:17:28 ..
2022-10-05 12:33:58 UTC
(SPIDs: [1838..1866])
Last Update:
2022-10-05 14:04:11 UTC --fnord666

Support us: Subscribe Here
and buy SoylentNews Swag


We always have a place for talented people, visit the Get Involved section on the wiki to see how you can make SoylentNews better.

posted by on Sunday October 30 2016, @02:45PM   Printer-friendly
from the no-fishing-for-me-this-morning dept.

Right, so there's currently a DDoS of our site specifically happening. Part of me is mildly annoyed, part of me is proud that we're worth DDoS-ing now. Since it's only slowing us down a bit and not actually shutting us down, I'm half tempted to just let them run their botnet time out. I suppose we should tweak the firewall a bit though. Sigh, I hate working on weekends.

Update: Okay, that appears to have mitigated it; the site's functional at a reasonable rate of responsiveness.

Update2: Attack's over for now. You may go about your business.

posted by martyb on Monday October 10 2016, @01:26AM   Printer-friendly
from the fun-with-numbers dept.

Since the launch of SoylentNews in February of 2014, there have been 274,870 comment moderations made against the 412,100 comments that our community has posted to our site. Who has posted the most comments? Who garnered the most up-moderations? The most down-moderations?

Such simple questions, but they led to a fun bit of DB querying. The results surprised me, and I thought others might be interested, as well. Most surprising to me was the assessment of comments from Anonymous Cowards.

[Continues...]

Who received the most moderations?

For better or worse, to whom did Soylentils direct their greatest moderation effort?

NICK UID TOTAL DOWN UP NET
The Mighty Buzzard 18 2260 626 1634 1008
takyon 881 2315 103 2212 2109
aristarchus 2645 2494 615 1879 1264
c0lo 156 2717 183 2534 2351
Thexalon 636 3225 83 3142 3059
Ethanol-fueled 2792 3447 1238 2209 971
VLM 445 4401 346 4055 3709
Runaway1956 2926 4531 992 3539 2547
frojack 1554 5855 593 5262 4669
Anonymous Coward 1 78936 13002 65934 52932

The single greatest target of moderation was the "Anonymous Coward" with 78,936 moderations. This was followed by frojack, Runaway1956, VLM, Ethanol-fueled, and Thexalon who garnered over 3000 moderations each.

Who had the most down-moderations?

Here, only the number of down moderations was considered — it mattered not whether it was Flamebait or Troll — they all counted the same.

NICK UID TOTAL DOWN UP NET
VLM 445 4401 346 4055 3709
Hairyfeet 75 1620 387 1233 846
MichaelDavidCrawford 2339 1513 387 1126 739
frojack 1554 5855 593 5262 4669
aristarchus 2645 2494 615 1879 1264
The Mighty Buzzard 18 2260 626 1634 1008
jmorris 4844 2144 753 1391 638
Runaway1956 2926 4531 992 3539 2547
Ethanol-fueled 2792 3447 1238 2209 971
Anonymous Coward 1 78936 13002 65934 52932

Once again, our prolific AC topped the list with 13,002 down-mods. Ethanol-fueled was the only other user who topped 1000 down-mods, coming in with 1238. Runaway1956 made a valiant showing with 992 down-mods.

Who had the most up-moderations?

In the eyes of the community, who most often received an up-mod? Again, no consideration was given for the nature of the up-mod — Insightful, Interesting, or Informative — all were considered the same.

NICK UID TOTAL DOWN UP NET
aristarchus 2645 2494 615 1879 1264
Phoenix666 552 2184 80 2104 2024
Ethanol-fueled 2792 3447 1238 2209 971
takyon 881 2315 103 2212 2109
c0lo 156 2717 183 2534 2351
Thexalon 636 3225 83 3142 3059
Runaway1956 2926 4531 992 3539 2547
VLM 445 4401 346 4055 3709
frojack 1554 5855 593 5262 4669
Anonymous Coward 1 78936 13002 65934 52932

Once again AC reins supreme with 65,934 up-mods. This was followed by frojack with 5,262 and VLM with just over 4000.

Who had the highest net-moderation?

Putting it all together — subtracting the number of down-mods from the number of up-mods — who had the highest net moderation on our site?

NICK UID TOTAL DOWN UP NET
wonkey_monkey 279 1754 117 1637 1520
maxwell demon 1608 1786 55 1731 1676
Phoenix666 552 2184 80 2104 2024
takyon 881 2315 103 2212 2109
c0lo 156 2717 183 2534 2351
Runaway1956 2926 4531 992 3539 2547
Thexalon 636 3225 83 3142 3059
VLM 445 4401 346 4055 3709
frojack 1554 5855 593 5262 4669
Anonymous Coward 1 78936 13002 65934 52932

Once again, the shy but prolific AC tops the list with a net of 52,932 mod points. Only one other Soylentil was able to surpass 4000: frojack with 4,669. Two other Soylentils exceeded 3000: VLM with 3709 and Thexalon with 3059.

Who hath pointy horns?

Who managed to acquire the most down-mods as a percentage of all moderations on their comments? For a tie, number of moderated comments is the second sort field. Who is the devil in our midst?

NICK UID TOTAL #DOWN %DOWN #UP %UP NET
scarboni888 5061 1 1 100.00 0 0.00 -1
MooCow 6048 1 1 100.00 0 0.00 -1
cybergimli 436 2 2 100.00 0 0.00 -2
rancidman 769 2 2 100.00 0 0.00 -2
rmdingler 1038 2 2 100.00 0 0.00 -2
SoylentsISay 1331 2 2 100.00 0 0.00 -2
stupid 2631 2 2 100.00 0 0.00 -2
contrapunctus 3495 2 2 100.00 0 0.00 -2
killal -9 bash 2751 5 5 100.00 0 0.00 -5

Pfft, just a few minor imps around here. killal -9 bash topped (bottomed?) the list with 5 down-mods out of 5 moderations.

Who earned a Halo?

Whose comments had the best percentage of up-mods to total-mods? And in the case of ties, received the most up-mods? Who are the angels among us?

NICK UID TOTAL #DOWN %DOWN #UP %UP NET
dx3bydt3 82 69 0 0.00 69 100.00 69
romlok 1241 70 0 0.00 70 100.00 70
Hawkwind 3531 75 0 0.00 75 100.00 75
jdccdevel 1329 78 0 0.00 78 100.00 78
rleigh 4887 102 0 0.00 102 100.00 102
DrMag 1860 103 0 0.00 103 100.00 103
SrLnclt 1473 117 0 0.00 117 100.00 117
Joe 2583 126 0 0.00 126 100.00 126
Aiwendil 531 164 0 0.00 164 100.00 164

Here, it appears we've got a flock of angels, or at least people who know which way the wind blows. All folks listed here scored 100.00% meaning all of their moderations were up-mods. Aiwendil topped our list with 164, and we had 4 others — Joe, SrLnclt, DrMag, and rleigh — who each had over 100 such comment moderations... not even a single down-mod among them!

I must admit I was surprised to see the sheer number of positive moderations of AC comments, and the fact that 83.5% of those mods were positive.

[Update: Added two tables, one each for top percentage of down-mods and of up-mods. -Ed.]

posted by NCommander on Tuesday September 20 2016, @01:00PM   Printer-friendly
from the now-you-can-be-1337-by-knowing-what-a-far-call-is dept.

The Retro-Malware series is an experiment on original content for SoylentNews, written in the hopes to motivate people to subscribe to the site and help grow our resources. The previous article talked a bit about the programming environment imposed by DOS and 16-bit Intel segmented programming; it should be read before this one.

Before we get into this installment, I do want to apologize for the delay into getting this article up. A semi-unexpected cross-country drive combined with a distinct lack of surviving programming documentation has made getting this article written up take far longer than expected. Picking up from where we were before, today we're going to look into Terminate-and-Stay Resident programming, interrupt chaining, and get our first taste of how DOS handles conventional memory. Full annotated code and binaries are available here in the retromalware git repo.

In This Article

  • What Are TSRs
  • Interrupt Handlers And Chaining
  • Calling Conventions
  • Walking through an example TSR
  • Help Wanted

As usual, check past the break for more. In addition, if you are a licensed ham operator or have ham radio equipment, I could use your help, check the details at the end of this article.

[Continues...]

What Are TSRs?

For anyone who used DOS regularly, TSRs (short for Terminate and Stay Resident) were likely a source of both fun and frustration. Originally appearing in DOS 2.0, TSRs, as the name suggests, are programs that exit but leave some part of their code around in memory. TSRs are primarily used to provide device drivers, extended APIs, or hooks that other applications can take advantage of. At the same time, they also could be used (as we will be doing) to install invisible hooks to modify, change, or log system behaviors. In that sense, they can be considered broadly equivalent to extensions on classic Mac OS. The BIOS could be considered a special type of TSR as it's always available in memory to provide services to the operating system and applications.

From a technical perspective, a TSR is any application that executes int 21h with the right options. Ralph Brown's interrupt guide has this to say on DOS's API for TSRs:

DOS 2+ - TERMINATE AND STAY RESIDENT

AH = 31h
AL = return code
DX = number of paragraphs to keep resident

Return:
Never

Notes: The value in DX only affects the memory block containing the PSP; additional
memory allocated via AH=48h is not affected. The minimum number of paragraphs
which will remain resident is 11h for DOS 2.x and 06h for DOS 3.0+. Most TSRs can
save some memory by releasing their environment block before terminating
(see #01378 at AH=26h,AH=49h). Any open files remain open, so one should
close any files which will not be used before going resident; to access a file
which is left open from the TSR, one must switch PSP segments first (see AH=50h)

Well, for most people, I suspect that is as clear as mud. Let me try and explain it a bit better. Essentially, when a program flags to DOS that it wants to TSR, DOS simply leaves the amount of memory marked in DX alone, and marks those paragraphs (which are 16 bytes) as 'in use' so neither it nor any other well-behaving programs will attempt to use them. No relocation or copying is done as part of this process; the memory is simply marked dead, and left 'as is' as we'll see below.

This is problematic for a number of reasons. As I mentioned in the previous article, when in real mode, Intel processors can only access up to 1 MiB of memory, and it's the area where all applications, drives and device address space needs to squeeze into. Of this, only 640 kiB are normally available to applications (which is known as conventional memory). If a TSR is too large, or too many are loaded, its is very easy to run out of RAM to do anything useful with the machine. To make matters worse, DOS provides absolutely no mechanism to manage or uninstall TSRs. (Once an application is resident, it's staying there unless it is specifically designed to unhook itself and free itself from memory.) Combine that with the fact that there's no 'official' way of doing so in the DOS APIs.

This quickly lead to an era where you might need a specific boot floppy for a given application so as to have its TSRs available (such as mouse or network drivers) — and nothing else — so that there would be enough conventional memory left to fit everything in. While several third-party efforts tried to standardize TSR installation/removal — such as TesSeRact — none of them became a true de-facto standard. Furthermore, it is very possible for a TSR removal to leave memory in a fragmented state which could break other applications. An entire cottage industry of memory optimizers quickly sprang up which could load TSRs into high memory.

At this point, you may be wondering "If TSRs are so miserable, why use them?". The answer, unfortunately, is that it is the only way on DOS to provide any sort of extended functionality. DOS has no concept of shared libraries or multitasking; it was TSR or bust. This brings us to our next topic: interrupt handling.

Interrupt Handlers

While I touched on interrupts in the previous article, I didn't go into too much detail. Interrupts, simply put, are special signals sent to the processor to tell it to stop what it's doing and do something else immediately. These interrupts can be generated by either hardware or software. Interrupts essentially operate like this:

  • Processor is doing work.
  • Interrupt occurs
  • Processor saves location and jumps to interrupt handler
  • Interrupt handler runs
  • Interrupt handler finishes, and returns to the original task

When an interrupt occurs, the processor looks at the Interrupt Vector Table (IVT) located at 0x0 to determine where it needs to jump to handle that interrupt. The function that handles an interrupt is known as an Interrupt Service Routine (ISR). Assuming there is a valid handler address in the IVT, the processor does a far call to the IVT and immediately continues execution. A 'bare bones' interrupt handler looks something like this:

previous_hook_offset: dw 0
previous_hook_segment: dw 0

hook:
	; When we come into an interrupt, only the
	; code segment and instruction pointer are preserved
	; for us. It's the responsibility of the handler to
	; preserve this information.

	; This is saved on the application's local stack, which is fine
	; for now (FreeDOS does the same thing internally) as long as
	; we're not putting any large items on it. We'll look at setting
	; up a local stack later.

	pushf ; Save flags
	pusha ; push all general registers to the stack

	; Setup segments
	push ds
	push es

	; For interrupt handlers, CS=DS normally, and SS either points at:
	; the application stack (aka, whatever was running before we were)
	; or at a local stack setup by the TSR.
	; On x86, it's not possible to directly copy from one segment register
	; to another, so we'll use AX as a scratch:
	mov ax, cs
	mov ds, ax
	mov es, ax

	; Let's add a "hello world" hook:

	; NOTE: Normally it's a bad idea to call DOS interrupts in a TSR
	; because DOS itself is not re-entrant. However, as in this example,
	; we've hooked the unused 0x66, which DOS does not call out of the box,
	; which means we'll never be in this ISR while we're in DOS.
	; If this were real code, we would have to check the INDOS flag for sanity.
	mov ah, 9
	mov dx, hello_world_str
	int 0x21

	; DOS compatability "quirk?". On DOSBox (which I initially tested this on)
	; there's a default entry in the IVT for all interrupts in F000:xxxx.
	; Documentation suggestions that this is also the default behavior
	; for MS-DOS though I can't confirm it.
	;
	; FreeDOS, on the other hand, leaves unused INTs initialized to 0000:0000
	; so blindly far calling it causes a fault. So we need to check if the
	; segment is 0000, and skip chaining if that's the case

	cmp word [previous_hook_segment], 0x0000
	je skip_chain

	; Chain to other TSRs
	pushf ; pushf is required because iret expects to pop flags
	call far [previous_hook_offset]

	skip_chain:

	; We're done, restore to previous state
	pop es
	pop ds
	popa
	popf

	; To return from an interrupt, we use the special iret instruction
	iret

Quite a bit of code for not doing much. As the code comments explain, the interrupt handler has to preserve any information in the registers it wants to use. For this example, we just save everything with a pushf instruction followed by pusha instruction, which puts the FLAGS register followed by all the general purpose registers (AX-DX, SI, DP, BP, SP) on the stack. Preserving flags in an ISR is extremely important since FLAGS is where things like comparison results are stored; if you corrupt FLAGS, it's completely possible that an application evaluates an "if" statement the wrong way and becomes a source of hard-to-impossible-to-find bugs.

ISRs are somewhat notorious in that they appear deceptively easy to code, and absolutely disastrous if you get it wrong. One of the major things to be aware of is that it's possible for an interrupt to be interrupted. For example, if your interrupt handler is running, and someone taps on the keyboard, the keyboard handler will preempt you. Depending on what you're doing, this might not be a problem, or it could "lock up" the computer. ISRs can turn interrupts on and off with the sti/cli instructions, but an all-too-common bug is forgetting to turn interrupts back on. Raymond Chen, a developer at Microsoft, wrote an entire chapter in his book "The Old New Thing" dedicated to the things that stupid applications do that Windows had to patch around — such as forgetting how to handle interrupts.

The second consequence is that ISRs should be reentrant. For those who are not hugely familiar with computer programming, reentrancy is the ability for a subroutine to be interrupted, then called again safely. For example, if you're listening to keyboard events, it's possible that two events can come at the same time and the second event preempts the first one. Bad Things(tm) happen if you have non-reentrant ISRs. The only reason this is a 'should' vs. a 'must' is that DOS itself is not reentrant; as the comment explains, you can't safely call a DOS interrupt from an ISR. DOS provides a special global flag known as INDOS to let callers know if it's safe to make an interrupt check; it was excluded above for brevity and because we used an unused interrupt.

The final common pitfall for DOS-based ISRs is it is possible for multiple TSRs to hook the same interrupt. For example, App A and App B can both decide they want the same interrupt. Depending on the application, it may chain interrupts down, or it may claim an interrupt entirely for itself. This can lead to infuriatingly complicated issues to debug if the other TSR is not well-behaved. Microsoft and IBM eventually provided built-in TSR multiplexing in DOS in the form of int 2F, but the API is extremely difficult to use and failed to solve many of the inherent issues.

The Stack and Calling Conventions

Let's take a momentary digression from TSRs to look at how functions work and how they interact with the stack. From an instruction perspective, Intel processors provide a "call" opcode which pushes the current instruction pointer to the stack, and then unconditionally jumps to a given location. It doesn't, however, define the behavior of how arguments are passed or the management of the stack. As such, developers have created conventions to specify how the stack and arguments should be passed from one function to another.

For non-programmers, the stack can be considered to be a "working space" where a program can store local variables and temporary information such as result values. In contrast to the heap, stacks are relatively small, and are essentially localized to a given function. For historical reasons, the stack grows 'down' from upper memory addresses to lower memory addresses. The stack pointer SP always points to the top of the stack. When information is pushed to the stack with a "push" operation, the value saved is stored in memory to the location pointed at SP and the register itself is decremented by the size. In contrast, deleting an item from the stack simply increments SP allowing new information to override the old.

For example, let's assume we have a C function with the following prototype:

// By default, most C compilers use the CDECL calling convention on x86
int example(int a, int b) {
	// We'll do stuff here
	return a+b;
}

Unlike most architectures, x86 defines multiple types of calling conventions. Of these, the most common are stdcall (used primarily by Windows), and cdecl (C Declaration). For the code I write, I'm sticking to the cdecl convention for my own sanity. cdecl is what's known as a "caller-based" convention, which means the calling function is responsible for cleaning up the stack at the end of a function. Here's what the calling code looks like in assembly:

example_call:
	mov ax, 4
	mov bx, 5

	; Arguments go in left to right
	push ax
	push bx

	; Under CDECL, names are decorated with a _ to indicate
	; they're a function, so example becomes _example
	call _example

	; Now we need to clean the stack up
	add sp, 4

	; The return value (9) comes back in ax
	; all other registers are smashed (aka their
	; values are not preserved into or out of the
	; function)

Fairly straight forward, right? Let's look at how this function might be implemented so we can discuss the base pointer (BP) as well. Here's what _example looks like:

_example:
	; Setup stack frame
	push bp
	mov bp, sp

	; The stack now has the following layout
	; bp[+2] stack frame
	; bp[+4] int a
	; bp[+6] int b

	; Move values from the stack to registers
	mov ax, [bp+4]
	mov bx, [bp+6]
	add ax, bx

	pop bp
	ret

BP, or the base pointer, can be considered a reference point for where each function begins and ends. Whenever we enter or leave a function, the base pointer forms the base of the stack for that function (hence the name). These reference points are known as stack frames, and since every function copies SP (which always points to the top of the stack) to BP, you can always tell where you are relative to other functions. Debuggers, for example, walk the stack to determine where they currently are by comparing the values of BP to known offsets.

Near and Far Calls

Before we leave the topic of calling conventions, the final point to bring up are near and far calls. In the previous article, I discussed that 16-bit processors can only reference up to 64 kilobytes of memory directly at any given time. As such, if you need to reference code or data outside that 64k window, you need to change the segment so it's pointing in the right location.

For functions, code that's within the same segment is known as a near call. Near calls are equivalent to normal function calls on most other architectures. Far calls in contrast include the required segment, and load CS as part of the function call. Far calls are made by the "call far" instruction, and require the called function to use the "retf" instruction to indicate they need to return far. Far calls have a fairly high performance hit due to the segment change, and thus should be limited as much as possible

In the previous interrupt handler example code, we saw that we had to do a far call to chain to previous TSRs. The reason for this is that interrupt service handling is essentially a special case of a far call; the processor has to change to the ISR's segment in memory. When we chain to another interrupt handler, we have to do the same thing. If you're still confused, the following example will clear things up.

TSRs In Action

So now that we have the basis of TSRs in our heads, let's look at how they're managed and installed by the operating system. To do that, we need an actual DOS installation. While TSRs do work in DOSbox, DOSbox has some unusual quirks with its environment that make it not 100% accurate to actual DOS (for example, all interrupts have an installed default handler; FreeDOS at least does not do this).

Installing FreeDOS

Fortunately for free software, FreeDOS exists which is a (mostly) compatible free software re-implementation of DOS 5. Installation is pretty much identical to what DOS 5 would have been like if it was shipped on a CD vs. floppy disks

*

The CD is bootable, and starting it up in VirtualBox brings up this boot menu.

*

The installer offers to start FDISK to create a boot partition. Users of MS-DOS FDISK should find this more or less identical to the standard FDISK.COM

* *

After which DOS installation takes a few minutes, and then promptly crashes. For reasons I can't figure out, the included JemmEx memory extender refuses to work under VirtualBox. Fortunately, EMM386 is happy to do the job, and after a quick reboot, I get dumped to C:\

After configuring WatTCP, and firing up the built in FTP server, I can copy my TSRs over without issue. Of course, given that DOS uses VESA graphics, I can't copy and paste. Fortunately for my sanity, FreeDOS (and MS DOS) support redirecting the terminal with the CTTY command. After a little bit of fiddling with VirtualBox's settings, I get this:

*

Copy and paste for the win. Anyway, now that we have a decent to use testing environment, let's get into the practical aspect of this.

DOS Memory Layout

After doing a clean reboot of the system, FreeDOS reports the following as its memory usage:

C:\>mem

Memory Type       Total       Used     Free
--------------- ---------  -------- --------
Conventional         639K       50K     589K
Upper                 36K       31K       5K
Reserved             349K      349K       0K
Extended (XMS)    31,680K    5,626K  26,054K
---------------- --------  -------- --------
Total memory      32,704K    6,056K  26,648K

Total under 1 MB     675K       81K     594K

Total Expanded (EMS) 31M (32,571,392 bytes)
Free Expanded (EMS)  25M (26,705,920 bytes)

Largest executable program size   589K (602,672 bytes)
Largest free upper memory block     4K ( 4,096 bytes)
FreeDOS is resident in the high memory area.
C:\>

Lots of numbers, right? We'll do a more in-depth article about the types of memory, but let's do a brief primer here so that the output can be understood. Let's break these down step by step

Conventional

Conventional memory is what applications in DOS generally have available and refers to the lower 640k of the 1 MiB address space. Anything operating in real mode has to fit in this memory area. FreeDOS reports a total of 639k because a very small chunk of RAM at 0x0000 has to be reserved for the processor's interrupt tables, as well as a small part of COMMAND.COM that has to stay resident at all times to aid things like LOADALL. On this specific system, I have a few TSRs already installed to provide network services which is why a 50k block of conventional memory is already used.

Upper/Reserved

Above the 640k line is what's referred to as the "upper memory area", or UMA and is reserved by DOS. The upper memory area also has things like the monochrome and VGA memory buffers, as well as option ROMs, the DOS kernel, and the BIOS shadow map. Normally, this region of memory shouldn't be used by applications, but due to the fact that conventional memory can get very crowded, on most systems there are small but usable sections of memory in these areas, known as UMA blocks. A memory manager can determine which blocks are safe to use, and load applications or data into these chunks, a process known as "loading high". When we get into hiding our TSR, use of upper memory will become very important

Extended Memory

Memory that exists above 1M+64k (that 64k is special, see below), and cannot be directly accessed by real mode. Because neither DOS nor the BIOS can operate in 32-bit/protected mode, and that the 80286 processor could not easily switch from protected mode to real mode, accessing memory above the 1 MiB barrier required various amounts of trickery. Extended memory can extend up from 1 MiB to 4 GiB (which is the architectural limit of 32-bit processors). Accessing extended memory either requires entering protected mode, tricking the processor into unreal mode (which on the 80286 required the LOADALL instruction to put the processor in an invalid state), or using a BIOS service which did one of the previous two options to exchange blocks with conventional memory.

High Memory Area (not shown)

One important line to look at is "FreeDOS is resident in the high memory area." I've stated multiple times that 1 MiB is the limit of what Intel processors can address. As it turns out, this is only a partial truth. Remember that addressing in real mode is done in the form of segment:offset. So what happens if I load a segment value of FFFF?. Well it turns out we can address an additional 64 kilobytes of RAM beyond the 1 MiB barrier. This is known as the high memory area.

Due to many quirks related to the abomination known as A20 (which will get an entire section in the next article), the high memory area requires special rules and methods to access. The short version is that unless you have a memory manager, or are willing to manipulate the A20 line directly (which is dangerous), the HMA is not usable by general applications. We'll look more at this in a future article.

TSR Loading

So with that all out of the way, let's look at how a TSR is loaded. In the github repository, there's an example TSR known as tsr_example which, when loading, prints out the segment registers and the segment:offset of the next hook in memory. It's combined with a "callhook" program that simply runs int 0x66 to invoke it. So let's load it and see what happens:

C:\> tsr_demo
DOS loaded the COM with this:
CS: 0C9C
DS: 0C9C
SS: 0C9C
 

When our TSR is loaded, it reads and dumps out the segment registers, showing DOS loaded us at 0C9C. For COM files (or any executable that is 'tiny'), CS=DS=SS. When DOS loads a COM executable, the entire thing is copied into memory, CS/DS/SS are set to the execution point, and the process far calls to CS:0100 to begin execution. If we check our memory usage, we can see that it has dropped:

C:\>mem

Memory Type         Total     Used     Free
---------------- -------- -------- --------
Conventional         639K     115K     524K

NOTE: It shouldn't be using 50 kiB of RAM per run; the binary is only 324 bytes! I think I'm calculating the paragraphs-to-preserve number wrong, but I didn't get a chance to fix it by time this article went up. If someone wants to look at the code, check tsr_examine.asm; the TSR int call is at the very bottom of the file and based off example code I found elsewhere.

If we run callhook, we can determine that our TSR in fact installed successfully, and the previous hook is at 0000:0000 (which is skipped over).

C:\>callhook
CS: 0C9C
DS: 0C9C
SS: 1CB6
Previous hook is at 0000:0000

Note that SS is different. When a TSR is invoked (in this case by doing int 0x66), it inherits the running state of whatever application that was running at the time. It's the responsibility of the TSR to put the stack back the way it found it when it exits, else you'll cause random corruption in userspace applications.

Now lets look at see what happens if we invoke our TSR multiple times:

C:\>tsr_demo
DOS loaded the COM with this:
CS: 1CB6
DS: 1CB6
SS: 1CB6
C:\>tsr_demo
DOS loaded the COM with this:
CS: 2CD0
DS: 2CD0
SS: 2CD0
C:\>tsr_demo
DOS loaded the COM with this:
CS: 3CEA
DS: 3CEA
SS: 3CEA

With each load, we're loading higher in memory. DOS does not automatically rebase or relocate TSRs; they stay at whatever memory segment they were in when they terminated. As DOS automatically loads COM files as low as possible, each run is loaded at the next "available" section of RAM. Calling mem shows that our available conventional memory has dropped

C:\>mem

Memory Type Total Used Free
---------------- -------- -------- --------
Conventional 639K 308K 331K

So what now happens if we run callhook?

C:\>callhook
CS: 3CEA
DS: 3CEA
SS: 4D04
Previous hook is at 2CD0:0103
CS: 2CD0
DS: 2CD0
SS: 4D04
Previous hook is at 1CB6:0103
CS: 1CB6
DS: 1CB6
SS: 4D04
Previous hook is at 0C9C:0103
CS: 0C9C
DS: 0C9C
SS: 4D04
Previous hook is at 0000:0000

We chain through each version of the TSR, easily visible by CS/DS changing as we go upwards until we reach the 'stop' at 0000:0000. At this point, I think we have a fairly good grasp on how TSRs work in practice, what DOS gives us, and how interrupts work more in-depth. At this point, this article is already past the 4k word mark, so I'm going to cut this off here before the editors stage a revolution. So let me close this off with the fact I need some help with the community.

Help Wanted

As I mentioned in Part 1, for getting the keylogged data out of the system, I'm interested in using a non-TCP/IP based protocol. Up until the mid-90s, IPX and NetBIOS-only networks were still relatively common, and it wasn't until the domination of the 'modern' internet that TCP/IP became ubiquitous. After considerable amounts of research, I've decided that fitting in the theme of 'unusual yet neat', I'd like to extract the data out using AX.25 and ham radio equipment. The other alternative I may do is using IPX, as I found the original DOOM source code actually has a complete IPX driver on it. As of right now, I'm somewhat torn between doing this with AX.25 or IPX. The thing is though, I'm going to need some help to make AX.25-based keylogger a reality.

The use of standard radio would allow the keylogger to work on air-gapped computers and show how a potential exfiltration of data might have been done in environments predating TCP/IP. It would be fairly easy to modify a standard PC to hide a 2m or 70cm transmitter within the case and connect to it via I/O lines in an early form of the NSA's current Tailored Access Operations. It would also mean that the keylogger itself would be fairly useless for use in real-life which aids the goal of preventing proliferation of attack tools.

The problem is right now, I have a serious lack of equipment. While I'm a licensed ham in the United States (KD2JRT/Technician), the only equipment I have are two Baofeng UV-82s. What I need is to figure out a decent way to handle getting data broadcasted. I know it's at least theoretically possible to build a cable to hook the Baofengs up to a computer's mic/sound in, and use a software TNC (Terminal-Node Connector) to do AX.25. By doing so, I could simply connect the TNC to VirtualBox's serial port emulation, and “blamo”, AX.25 for DOS.

What I need from the community is two-fold:

  • Experience with doing AX.25 data in real life
  • Help building the necessary cables *or* loaning radio equipment with hardware TNCs on it

I'm currently in New York City for the foreseeable future. I could potentially build cables for my Baofengs myself but I don't currently have a soldering iron, and my living situation makes it rather difficult to do electronics work here. Depending on pricing, I can probably cover shipping and handling, or compensate out-of-pocket work done by a community member. If you're interested in helping, post a comment or send me an email (mcasadevall@soylentnews.org), and I'll be in touch.

Finally, if you've enjoyed this article, please consider subscribing or gifting a subscription. No account is required, as you can anonymously gift subscriptions to my alt-account, mcasadevall (6). I'm hoping we can raise enough money to fully pay off the stakeholders of the site, and perhaps get a small budget together to let me dedicate more time to content like this, or buy equipment to explore more obscure pieces of hardware (i.e., digging into doing some INIT coding on classic Mac OS, or something of that nature). I'd like to give thanks to all those subscribed, including jimtheowl after the previous article.

And with that, 73 de NCommander!

posted by cmn32480 on Friday September 16 2016, @10:23AM   Printer-friendly
from the blame-the-mighty-buzzard dept.

We have been informed by Linode (which hosts our servers) that there is some hardware maintenance being performed tonight. The impacted servers are 'fluorine' and 'neon'. Here is the message we received:

Linode continuously monitors the health of our equipment and we've been alerted to a condition which affects the physical server on which your Linode is hosted. While we have determined that this is not an emergency, this should be addressed in order to optimize the performance of your Linode. We have scheduled a maintenance window for the physical server on which your Linode is hosted:

Friday, September 16, 2016 at 1:00 AM EDT (5:00 AM UTC)

Downtime from this maintenance is expected to be no more than 1 hour. Please note, however, that the entire maintenance window may be required. Your Linode will be gracefully powered down and rebooted during the maintenance. Services not configured to start on a reboot will need to be manually started. If this time frame does not work for you, you have the option of migrating to another host which has these settings enabled.

Thanks to redundancy between our front end and database servers, our main site should remain functional. There will, however, be some minor inconveniences. During this period:

  • we will not be able to process credit card payments,
  • comment counts will not update, and
  • emails and web notifications will not go out.

We appreciate your understanding and patience while the servers are being serviced.

We anticipate most of the affected services should auto-restart; those that do not will be addressed starting around 0600 CDT (0700 EDT / 1100 UTC).

UPDATE: All is shiny and happy again.


Original Submission

posted by NCommander on Tuesday August 30 2016, @12:14PM   Printer-friendly
from the int-21h-is-how-cool-kids-did-it dept.

I've made no secret that I'd like to bring original content to SoylentNews, and recently polled the community on their feelings for crowdfunding articles. The overall response was somewhat lukewarm mostly on dividing where money and paying authors. As such, taking that into account, I decided to write a series of articles for SN in an attempt to drive more subscriptions and readers to the site, and to scratch a personal itch on doing a retro-computing project. The question then became: What to write?

As part of a conversation on IRC, part of me wondered what a modern day keylogger would have looked running on DOS. In the world of 2016, its no secret that various three letter agencies engage in mass surveillance and cyberwarfare. A keylogger would be part of any basic set of attack tools. The question is what would a potential attack tool have looked like if it was written during the 1980s. Back in 1980, the world was a very different place both from a networking and programming perspective.

For example, in 1988 (the year I was born), the IBM PC/XT and AT would have been a relatively common fixture, and the PS/2 only recently released. Most of the personal computing market ran some version of DOS, networking (which was rare) frequently took the form of Token Ring or ARCNet equipment. Further up the stack, TCP/IP competed with IPX, NetBIOS, and several other protocols for dominance. From the programming side, coding for DOS is very different that any modern platform as you had to deal with Intel's segmented architecture, and interacting directly with both the BIOS, and hardware. As such its an interesting look at how technology has evolved since.

Now obviously, I don't want to release a ready-made attack tool to be abused for the masses especially since DOS is still frequently used in embedded and industry roles. As such, I'm going to target a non-IP based protocol for logging both to explore these technologies, while simultaneously making it as useless as possible. To the extent possible, I will try and keep everything accessible to non-programmers, but this isn't intended as a tutorial for real mode programming. As such I'm not going to go super in-depth in places, but will try to link relevant information. If anyone is confused, post a comment, and I'll answer questions or edit these articles as they go live.

More past the break ...

Looking At Our Target

Back in 1984, IBM released the Personal Computer/AT which can be seen as the common ancestor of all modern PCs. Clone manufacturers copied the basic hardware and software interfaces which made the AT, and created the concept of PC-compatible software. Due to the sheer proliferation of both the AT and its clones, these interfaces became a de-facto standard which continues to this very day. As such, well-written software for the AT can generally be run on modern PCs with a minimum of hassle, and it is completely possible to run ancient versions of DOS and OS/2 on modern hardware due to backwards compatibility.

A typical business PC of the era likely looked something like this:

  • An Intel 8086 or 80286 processor running at 4-6 MHz
  • 256 kilobytes to 1 megabyte of RAM
  • 5-20 MiB HDD + 5.25 floppy disk drive
  • Operating System: DOS 3.x or OS/2 1.x
  • Network: Token Ring connected to a NetWare server, or OS/2 LAN Manager
  • Cost: ~$6000 USD in 1987

To put that in perspective, many of today's microcontrollers have on-par or better specifications than the original PC/AT. From a programming perspective, even taking into account resource limitations, coding for the PC/AT is drastically different from many modern systems due to the segmented memory model used by the 8086 and 80286. Before we dive into the nitty-gritty of a basic 'Hello World' program, we need to take a closer look at the programming model and memory architecture used by the 8086 which was a 16-bit processor.

Real Mode Programming

If the AT is the common ancestor of all PC-compatibles, then the Intel 8086 is processor equivalent. The 8086 was a 16-bit processor that operated at a top clock speed of 10 MHz, had a 20-bit address bus that supported up to 1 megabyte of RAM, and provided fourteen registers. Registers are essentially very fast storage locations physically located within the processor that were used to perform various operations. Four registers (AX, BX, CX, and DX) are general purpose, meaning they can be used for any operation. Eight (described below) are dedicated to working with segments, and the final registers are the processor's current instruction pointer (IP), and state (FLAGS) An important point in understanding the differences between modern programming environments and those used by early PCs deals with the difference between 16-bit and 32/64-bit programming. At the most fundamental level, the number of bits a processor has refers to the size of numbers (or integers) it works with internally. As such, the largest possible unsigned number a 16-bit processor can directly work with is 2 to the power of 16 (minus 1) or 65,535. As the name suggests, 32-bit processors work with larger numbers, with the maximum being 4,294,967,296. Thus, a 16-bit processor can only reference up to 64 KiB of memory at a given time while a 32-bit processor can reference up to 4 GiB, and a 64-bit processor can reference up to 16 exbibytes of memory directly.

At this point, you may be asking yourselves, "if a 16-bit processor could only work with 64 KiB RAM directly, how did the the 8086 support up to 1 megabyte?" The answer comes from the segmented memory model. Instead of directly referencing a location in RAM, addresses were divided into two 16-bit parts, the selector and offset. Segments are 64 kilobyte selections of RAM. They could generally be considered the computing equivalent of a postal code, telling the processor where to look for data. The offset then told the processor where exactly within that segment the data it wanted was located. On the 8086, the selector represented the top 16-bits of an address, and then the offset was added to it to create 20-bits (or 1 megabyte) of addressable memory. Segments and offsets are referenced by the processor in special registers; in short you had the following:

  • Segments
    • CS: Code segment - Application code
    • DS: Data segment - Application data
    • SS: Stack segment - Stack (or working space) location
    • ES: Extra segment - Programmer defined 'spare' segment
  • Offsets
    • SI - Source Index
    • DI - Destination Index
    • BP - Base pointer
    • SP - Stack pointer

As such, memory addresses on the 8086 were written in the form of segment:offset. For example, a given memory address of 0x000FFFFF could be written as F000:FFFF. As a consequence, multiple segment:offset pairs could refer to the same bit of memory; the addresses F555:AAAF, F000:FFFF, and F800:7FFF all refer to the same bit of memory. The segmentation model also had important performance and operational characteristics to consider.

The most important was that since data could be within the same segment, or a different type of segment, you had two different types of pointers to work with them. Near pointers (which is just the 16-bit offset) deal with data within the same segment, and are very fast as no state information has to be changed to reference them. Far pointers pointed to data in a different selector and required multiple operations to work with as you had to not only load and store the two 16-bit components, you had to change the segment registers to the correct values. In practice, that meant far pointers were extremely costly in terms of execution time. The performance hit was bad enough that it eventually lead to one of the greatest (or worst) backward compatibility hacks of all time: the A20 gate, something which I could write a whole article on.

The segmented memory model also meant that any high level programming languages had to incorporate lower-level programming details into it. For example, while C compilers were available for the 8086 (in the form on Microsoft C), the C programming language had to be modified to work with the memory model. This meant that instead of just having the standard C pointer types, you had to deal with near and far pointers, and the layout of data and code within segments to make the whole thing work. This meant that coding for pre-80386 processors required code specifically written for the 8086 and the 80286.

Furthermore, most of the functionality provided by the BIOS and DOS were only available in the form of interrupts. Interrupts are special signals used by the process that something needs immediate attention; for examine, typing a key on a keyboard generates a IRQ 1 interrupt to let DOS and applications know something happened. Interrupts can be generated in software (the 'int' instruction) or hardware. As interrupt handling can generally only be done in raw assembly, many DOS apps of the era were written (in whole or in part) in intel assembly. This brings us to our next topic: the DOS programming model

Disassembling 'Hello World'

Before digging more into the subject, let's look at the traditional 'Hello World' program written for DOS. All code posted here is compiled with NASM

; Hello.asm - Hello World

section .text
org 0x100

_entry:
 mov ah, 9
 mov dx, str_hello
 int 0x21
 ret

section .data
str_hello: db "Hello World",'$'

Pretty, right? Even for those familiar with 32-bit x86 assembly programming may not be able to understand this at first glance what this does. To prevent this from getting too long, I'm going to gloss over the specifics of how DOS loads programs, and simply what this does. For non-programmers, this may be confusing, but I'll try an explain it below.

The first part of the file has the code segment (marked 'section .text' in NASM) and our program's entry point. With COM files such as this, execution begins at the top of file. As such, _entry is where we enter the program. We immediately execute two 'mov' instructions to load values into the top half of AX (AH), and a near pointer to our string into DX. Ignore 9 for now, we'll get to it in a moment. Afterwords, we trip an interrupt, with the number in hex (0x21) after it being the interrupt we want to trip. DOS's functions are exposed as interrupts on 0x20 to 0x2F; 0x21 is roughly equivalent to stdio in C. 0x21 uses the value in AX to determine which subfunction we want, in this case, 9, to write to console. DOS expects a string terminated in $ in DX; it does not use null-terminated strings like you may expect. After we return from the interrupt, we simply exit the program by calling ret.

Under DOS, there is no standard library with nicely named functions to help you out of the box (though many compilers did ship with these such as Watcom C). Instead, you have to load values into registers, and call the correct interrupt to make anything happen. Fortunately, lists of known interrupts are available to make the process less painful. Furthermore, DOS only provides filesystem and network operations. For anything else, you need to talk to the BIOS or hardware directly. The best way to think of DOS from a programming perspective is essentially an extension of the basic input/output functionality that IBM provided in ROM rather than a full operating system.

We'll dig more into the specifics on future articles, but the takeaway here is that if you want to do anything in DOS, interrupts and reference tables are the only way to do so.

Conclusion

As an introduction article, we looked at the basics of how 16-bit real mode programming works and the DOS programming model. While something of a dry read, it's a necessary foundation to understand the basic building blocks of what is to come. In the next article, we'll look more at the DOS API, and terminate-and-stay resident programs, as well as hooking interrupts.

posted by martyb on Saturday August 27 2016, @03:38PM   Printer-friendly
from the Can-we-make-the-Top-400-by-Halloween? dept.

It has only been six short months since SoylentNews' Folding@Home team was founded, and we've made a major milestone: our team is now one of the top 500 teams in the world! We've already surpassed some heavy hitters like /. and several universities, including MIT. (But now is not the time to rest on our laurels. A certain Redmond-based software producer currently occupies #442.)

In case you aren't familiar with folding@home, it's a distributed computing project that simulates protein folding in an attempt to better understand diseases such as Alzheimer's and Huntington's and thereby help to find a cure. To that end, SoylentNews' team has completed nearly 16,000 work units.

If you'd like to contribute to our team by donating some spare CPU/GPU cycles, you can get started here. There are clients available for Linux, Windows, and OSX. Once you have installed the software, enter the TeamID 230319 to join us.

Feel free to join #folding on our IRC channel if you need any help, or just want to chat.

Thank you to all that have participated, and a special thanks to our top 10 folders:

  1. cmn32480
  2. Runaway1956
  3. Beldin65
  4. tibman
  5. LTKKane
  6. EricAlbers_ericalbers_com
  7. Kymation
  8. meisterister
  9. kurenai.tsubasa
  10. NotSanguine

Related Links:
http://folding.stanford.edu
http://fah-web.stanford.edu/cgi-bin/main.py?qtype=teampage&teamnum=230319


Original Submission

posted by NCommander on Thursday August 25 2016, @01:00PM   Printer-friendly
from the you-can-haz-RRSIG dept.

In the ongoing battle of site improvements and shoring up security, I finally managed to scratch a long-standing itch and signed the soylentnews.org domain. As of right now, our chain is fully validated and pushed to all our end-points.

Right now, I'm getting ready to dig in with TheMightyBuzzard to work on improving XSS protection for the site, and starting to lay out new site features (which will be in a future post). As with any meta post, I'll be reading your comments below.

~ NCommander

posted by martyb on Thursday August 25 2016, @12:17PM   Printer-friendly
from the subject-to-review dept.

Are subjects passé in comments on the post-social media web? Or are they a valid feature to enable human eye-scanning and relevant search results?

It is the opinion of this anonymous submitter that putting "Subjects are an anachronism" [1] or "SubjectsinCommentsareStupid" [2] is unhelpful at best and spam at worst. SoylentNews has a long legacy going back to Chips & Dips, the predecessor site to Slashdot (from whose code SoylentNews was forked).

With that in mind, subjects are not a vestigial feature but a useful and defining one. It makes longer threads friendly to readers, and separates this site from Digg, Reddit, Voat, and so many other disposable social media sites. Just as email would be worse without subjects, so too would SoylentNews.

Taken from actual SoylentNews comments; cf: [1] and [2].

Ed Note: I'm of two minds as to running this story. This is presented as one person's opinion and makes a case for continuing to have a Subject for each comment. As noted, others do not feel the same way. As SoylentNews is a community, your input guides us. So, what say you? Should we continue as-is? Make subjects optional? Dispense with them entirely? Other? What benefits and/or problems are likely to result?


Original Submission

posted by NCommander on Wednesday August 17 2016, @11:23AM   Printer-friendly
from the hrm dept.

So, during the last site update article, a discussion came up talking about how those who work and write for this site should get paid for said work. I've always wanted to get us to the point where we could cut a check to the contributors of SoylentNews, but as it stands, subscriptions more or less let us keep the lights on and that's about it.

As I was writing and responding to one specific thread, part of me started to wonder if there would be enough interest to try and crowdfund articles on specific topics. In general, meta articles in which we talk deploying HSTS or our use of Hesiod tend to generate a lot of interest. So, I wanted to try and see if there was an opportunity to both generate interesting content, and help get some funds back to those who donate their time to keep the lights on.

One idea that immediately comes to mind that I could write is deploying DNSSEC in the real world, and an active example of how it can help mitigate hijack attacks against misconfigured domains. Alternatively, on a retro-computing angle, I could cook something in 16-bit real mode assembly that can load an article from soylentnews.org. I could also do a series on doing (mostly) bare metal work; i.e., loading an article from PXE boot or UEFI.

However, before I get in too deep into building this idea, I want to see how the community feels about it. My initial thought is that the funds raised for a given article would dictate how long it would be, and the revenue would be split between the author, and the staff, with the staff section being divided at the end of the year as even as possible. The program would be open to any SN contributor. If the community is both interested and willing, I'll organize a staff meeting and we'll do a trial run to see if the idea is viable. If it flies, then we'll build out the system to be a semi-regular feature of the site

As always, leave your comments below, and we'll all be reading ...

~ NCommander

posted by NCommander on Monday August 15 2016, @07:01PM   Printer-friendly
from the fiddling-for-the-greater-good dept.

Since people seem to rather enjoy when I run articles on backend upgrades, here's another set of changes I made over the last week as I get back into the full swing working on the site.

The short list:

  • Migrated Beryllium (which hosts wiki+IRC+mail) to Apache 2.4
    • Upgraded said machine to PHP7
    • Needed to support OCSP stapling
    • Validating final checks before deploying HSTS to all public domains
    • Upgraded MediaWiki, SquirrelMail, and YOURLS to PHP7 compatible versions
  • Worked with TheMightyBuzzard and user comments to determine additional XSS protection headers we should deploy
  • Found (and removed) SSLv3 support on postfix and dovecot
  • Deployed DNSSEC on sylnt.us in preparation for signing soylentnews.org (here's the test results)

Read past the fold for more information.

Beryllium Upgrades

Beryllium is our "misc" services box. It basically hosts everything that isn't related to site infrastructure such as the wiki, our IRC server, and mail. Last week, I went through and fixed our SSL configuration on this machine to make sure that we were serving properly validated certificates, and that we had strong encryption on this box. While I succeeded on that front, for performance reasons, Apache 2.4 needed to be upgraded to support a somewhat obscure feature of TLS known as OCSP stapling.

What is OCSP stapling you ask? Well, to answer that, I need to take a moment to go into how SSL certificates work. Whenever a CA generates a certificate, they're essentially saying "this site is who it is and we're attesting to it". In a perfect world, a CA would never make a mistake, private keys would never leak, and we could always assume that a certificate is good. We don't live in that world, as such certificate authorities sometimes need to void a certificate. OCSP (which stands for the Online Certificate Status Protocol) is one of two ways to do this, and is the only method Let's Encrypt supports for certificate revocation.

OCSP is a replacement for older certificate revocation lists (CRLs) which in real-life rarely if ever worked as advertised. It's meant to allow the browser to update in real-time knowledge if a certificate is good or bad and react accordingly. OCSP however requires that the browser checks with a certificate authority's OCSP server, leaking the fact that user X is connecting to site Y. It also means that if access to the OCSP server is blocked, a user might not be aware that a certificate has been revoked. OCSP stapling solves both problems by having our servers grab the OCSP reply (which is timestamped), and sending it as part of the initial connection to our site, both increasing performance, and preventing a privacy leak.

Unfortunately, OCSP stapling requiring Apache 2.4 which required me to build it from source, and then migrate sites over from the older Apache 2.2 install. At the same time, I went through and upgraded PHP 7, and updated the other web applications we were using. For the most part, this was rather painless though I'm still tinkering with MediaWiki to make it happy on the new setup.

Beside the usual Apache pain, I went through and scanned our other major services and disabled SSLv3 support on postfix (SMTP) and dovecot. I need to go through and replace our self-signed certs with real ones here but that's a 'one step at a time thing'

XSS Mitigation

During the last site status article, an AC pointed us at this handy site showing security headers. As such, TheMightyBuzzard and I will be going through and enabling these (with the exception of public key pinning) on production sometime this week. HPKP requires quite a bit of planning to deploy and we're not ready to take that step just yet.

DNSSEC + sylnt.us

I've talked about wanted to deploy DNSSEC before, but various other things kept cropping up. That, and combined with outdated and misleading documentation kept me from actually getting around to doing this for ages. Over the weekend, I finally dug down and figured out the current best practices for DNSSEC, and with the help of audioguy, configured BIND to do automatic signing of the domain and uploaded our keys to our register.

As such, sylnt.us now has a fully validated signature chain, and a green key when checked with the DNSSEC validator. We will be signing soylentnews.org sometime in the near future, however, we ran into some DNS zone transfer issues between our nameservers and Linode which caused the RRSIG records to not properly upload. While this has been resolved for now, we're currently talking with Linode to understand why the transfer went pear-shapped and to prevent a second occurrence.

That's it for now. As always, post questions, comments below. I'll be reading!

~ NCommander

posted by NCommander on Monday August 08 2016, @12:00PM   Printer-friendly
from the now-with-a+-scores dept.

So after an extended period of inactivity, I've finally decided to jump back into working on SoylentNews and rehash (the code that powers the site). As such, I've decided to scratch some long-standing itches. The first (and easiest) to deploy was HSTS to SoylentNews. What is HSTS you may ask?

HSTS stands for HTTP Strict Transport Security and is a special HTTP header that signifies that a site should only be connected to over HTTPS and causes the browser to automatically load encrypted versions of a website should it see a regular URL. We've forbid non-SSL connections to SN for over a year, but without HSTS in place, a man-in-the-middle downgrade attack was possible by intercepting the initial insecure page load.

One of the big views I have towards SoylentNews is we should be representative of "best practices" on the internet. To that end, we deployed IPv6 publicly last year, and went HTTPS-by-default not long after that. Deploying HSTS continues this trend, and I'm working towards implementing other good ideas that rarely seem to see the light of day.

Check past the break for more technical details.

[Continues...]

As part of prepping for HSTS deployment, I went through every site in our public DNS records, and made sure they all have valid SSL certificates, and are redirecting to HTTPS by default. Much to my embarrassment, I found that several of our public facing sites lacked SSL support at all, or had self-signed certificates and broken SSL configurations. This has been rectified.

Let this be a lesson to everyone. While protecting your "main site" is always a good idea, make sure when going through and securing your infrastructure that you check every public IP and public hostname to make sure something didn't slip through the gaps. If you're running SSLLabs against your website, I highly recommend you scan all the subjectAlternativeNames listed in your certificate. Apache and nginx can provide different SSL options for different VHosts, and its very important to make sure all of them have a sane and consistent configuration.

Right now, HSTS is deployed only on the main site, without "includeSubdomains". The reason for this is I wanted to make sure I didn't miss any non-SSL capable sites, and I'm still working on getting our CentOS 6.7 box up to best-practices (unfortunately, the version of Apache it ships with is rather dated and doesn't support OSCP stapling. I'll be fixing this, but just haven't gotten around to it yet).

Once I've fixed that, and am happy with the state of the site, SN, and her subdomains will be submitted for inclusion into browser preload lists. I'll run an article when that submission happens and when we're accepted. I hope to have another article this week on backend tinkering and proposed site updates.

Until then, happy hacking!
~ NCommander